1. 18 Aug, 2023 1 commit
  2. 16 Aug, 2023 1 commit
  3. 15 Aug, 2023 1 commit
  4. 14 Aug, 2023 1 commit
    • Add rmsnorm residual bias add and quant (#55965) · 2ac6a7e4
      Committed by MarDino
      * add rmsnorm residual bias add and quant
      
      * refine python interface
      
      * add rmsnorm unittest
      
      * Add layernorm
      
      * fix layernorm unittest
      
      * refine unittest
      
      * fix example code
      
      * fix review comment
  5. 10 Aug, 2023 1 commit
    • Add variable_length_memory_efficient_attention (#55400) · 4036c937
      Committed by lzy
      * add variable_length_memory_efficient_attention
      * update variable_length_memory_efficient_attention unittest
      * update variable_length_mem_eff_attn's docs and unittest
      * update variable_length_mem_eff_attn's docs
      * Update test_variable_length_memory_efficient_attention.py
      * Update variable_length_memory_efficient_attention.cu
      * fix codestyle
      * fix variable_length_fmha's docs and unittest
      * fix variable_length_fmha's docs
  6. 08 Aug, 2023 2 commits
  7. 03 Aug, 2023 2 commits
  8. 26 Jul, 2023 1 commit
  9. 13 Jul, 2023 1 commit
  10. 12 Jul, 2023 1 commit
  11. 11 Jul, 2023 2 commits
  12. 03 Jul, 2023 1 commit
  13. 29 Jun, 2023 1 commit
    • Add fused_rope forward op (#54351) · a215c46a
      Committed by niuliling123
      * style
      
      * more
      
      * update ctest
      
      * Update legacy_backward.yaml
      
      * Update legacy_ops.yaml
      
      * Update legacy_ops.yaml
      
      * update
      
      * update
      
      * update for move
  14. 28 Jun, 2023 1 commit
  15. 26 Jun, 2023 1 commit
    • remove ops from OpsWithFluidKernelNeedMoveToPhi set (#54007) · 733eca85
      Committed by Sonder
      * remove ops from OpsWithFluidKernelNeedMoveToPhi set
      
      * open static build flag
      
      * OpsWithFluidKernelNeedMoveToPhi
      
      * open new_executor_static_build
      
      * add infermeta for cudnn_lstm
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * update
      
      * update
      
      * fix pow2 decay
      
      * fix pow2 decay
      
      * recover analysis_predictor.cc
      
      * fix pow2 decay
      
      * fix cudnn lstm
      
      * add output register info for svd
      
      * fix pow2_decay_with_linear_warmup_kernel
      
      * recover test lstm cudnn
      
      * recover svd register codes
      
      * fix register info
      
      * fix reduce sum register info
      
      * add output info for adadelta
      
      * add output info for adadelta
      
      * add output info for adamax
      
      * fix complex abs register info
      
      * add register info for cudnn_lstm_grad
      
      * recover
      
      * fix lstm cudnn
      
      * fix
      
      * fix xpu output register info
      
      * remove std::cout
      
      * add backend
      
      * remove output info in pow2_decay_with_linear_warmup_kernel
      
      * add judgment in TensorShouldBeFakeInitialized
      
      * recover power_
      
      * close new_executor_static_build
      
      * fix set_value_xpu
  16. 16 Jun, 2023 1 commit
  17. 14 Jun, 2023 1 commit
  18. 01 Jun, 2023 1 commit
  19. 26 May, 2023 1 commit
    • [PHI Decoupling]Create PHI shared lib (#53735) · da50a009
      Committed by YuanRisheng
      * create phi so
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * add file
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * perfect so
      
      * fix py3 bugs
      
      * delete all static target in phi
      
      * fix windows bugs
      
      * fix py3 bugs
      
      * fix ci bugs
      
      * fix windows bugs
      
      * fix bugs: gflags can't be linked by dynamic and static lib
      
      * fix bugs that can not load 3rd party
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix conflict
      
      * fix xpu bugs
      
      * fix mac compile bugs
      
      * fix psgpu bugs
      
      * fix inference failed
      
      * deal with conflict
      
      * fix LIBRARY_PATH bug
      
      * fix windows bugs
      
      * fix onednn error
      
      * fix windows compile bugs
      
      * fix windows compile bugs
      
      * fix test_cuda_graph_static_mode_error aborted
      
      * fix windows bugs
      
      * fix mac-python3 error
      
      * fix hip compile bugs
      
      * change mode to static
      
      * change to static mode
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix bugs
      
      * add static flag
      
      * add PADDLE_API
      
      * change position of PADDLE_API
      
      * fix windows bugs
      
      * change mode to dynamic lib
      
      * fix windows static bugs
      
      * deal with conflict
      
      * fix windows unit bug
      
      * fix coverage
      
      * deal with conflict
      
      * fix windows-inference
      
      * fix py3 bugs
      
      * fix bugs when compile type_info
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix windows openblas
      
      * fix xpu bugs
      
      * fix enforce_test in windows
      
      * update code according to comment
      
      * fix windows cmake bug
      
      * fix windows bugs
      
      * fix windows bugs
      
      * delete cinn unittest
      
      * fix cinn bugs
      
      ---------
      Co-authored-by: lzydev <1528794076@qq.com>
  20. 23 May, 2023 1 commit
  21. 10 May, 2023 1 commit
    • add index_put api (#52886) · f3393f49
      Committed by 傅剑寒
      * add index_put api
      
      * fix value broadcast in backward and add test case in static
      
      * add timeout=120s for index_put
      
      * add op_compat for index_put
      
      * add inplace index_put test
      
      * add test case where an index tensor in indices is int32 and indices.size is less than x.dims
      
      * add index_put api backward in cpu place
      
      * add backward test case
      
      * refactor code to delete some duplicated code
      
      * replace reshape with resize to reduce extra memcpy
      
      * add datatype flag in backward yaml
      
      * fix bug in documentation
      
      * Update python/paddle/tensor/manipulation.py
      
      ---------
      Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
  22. 24 Apr, 2023 1 commit
  23. 22 Apr, 2023 1 commit
    • [Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase (#52850) · b406a7db
      Committed by wangfengsheng1999
      
      * [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase
      
      * [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase
      
      * add test case
      
      * modify dot/metric.accuracy/static.accuracy/static.auc
      
      * modify inner/tensordot bug
      
      * test 9 api
      
      * [Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase
      
      * fix bug
      
      * support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy
      
      * code style
      
      * fix bug
      
      * fix test_dot_op bug
      
      * fix accuracy bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * codestyle
      
      * fix dot bug
      
      * fix dot bug
      
      * fix dot bug
      
      * code style
      
      * fix dot bug
      
      * fix dot bug
      
      * fix dot bug
      
      * fix dot bug
      
      * fix dot bug
      
      * fix dot bug
      
      * modify code
  24. 21 Apr, 2023 1 commit
    • 【0D output】support 0D output for matrix_rank/multi_dot (#52861) · 47fa8066
      Committed by GGBond8488
      * support_0D_output_for_matrix_rank_multi_dot, test=allcase
      
      * add 0D output test for matrix_rank and multi_dot, test=allcase
      
      * fix assert error, test=allcase
      
      * fix test error, test=allcase
      
      * fix other test error, test=allcase
      
      * fix other test error, test=allcase
      
      * fix test error, test=allcase
      
      * fix matrix_rank and multi dot test err test=allcase
      
      * fix test error test=allcase
      
      * fix test zero dim test, test=allcase
      
      * add static backward test for multi_dot, test=allcase
      
      * add tol 2d broadcast test case, test=allcase
  25. 19 Apr, 2023 1 commit
  26. 17 Apr, 2023 1 commit
    • Add output defs for some kernelsPhi register (#52941) · 23f87442
      Committed by Sonder
      * add register info for eigh and eig_grad
      
      * add sync_batch_norm_op.cu register info
      
      * add lamb output register info
      
      * add unique register info
      
      * change type name
      
      * change type name
      
      * add output register info for check_finite_and_unscale
      
      * update cmake and config file
      
      * add register info for adagrad
      
      * fix build error
      
      * add sync to run_unittests.sh
      
      * add register info for unique_consecutive
      
      * fix build error
      
      * add eigh to STATIC_BUILD_TESTS
      
      * update eig_kernel.cc
      
      * update eig_kernel.cc
      
      * fix infer meta error
      
      * fix unique register error
      
      * fix lamb register info error
      
      * fix lamb register info
      
      * update lamb register info
      
      * fix lamb
      
      * remove one Output Register
      
      * update static build file
      
      * add eigh op to disable_wingpu_test
      
      * update run_unittests
  27. 13 Apr, 2023 1 commit
    • [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26
      Committed by HongyuJia
      * [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h
      
      * Add logging.h for profiler.cc
      
      * Add logging.h for gloo_utils.h
      
      * Add logging.h for addmm_kernel_impl.h
      
      * Add logging.h for addmm_grad_kernel_impl.h
      
      * Add logging.h for p_send_kernel.cu
      
      * Add logging.h for determinant_grad_kernel_impl.h
      
      * Add logging.h for p_recv_kernel.cu
      
      * Add logging.h for elementwise_grad_base.h
      
      * Add logging.h for transfer_layout_kernel.cc
      
      * Add logging.h for eigvals_kernel.cc and index_select_impl.h
      
      * Add logging.h for all files in kernel directory
      
      * Add logging.h for xpu_info.cc
      
      * Add logging.h for xpu
  28. 11 Apr, 2023 2 commits
  29. 04 Apr, 2023 1 commit
  30. 30 Mar, 2023 2 commits
  31. 27 Mar, 2023 2 commits
  32. 24 Mar, 2023 2 commits
    • [PHI]fix momentum dtype infer (#51353) · 648ec795
      Committed by PuQing
      * fix momentum dtype infer
      
      * fix momentum datatype
      
      * fix on cpu
      
      * add momentum
    • Memory Efficient Attention (#51867) · e5ad3859
      Committed by ZhangDY-6483
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependence
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
  33. 23 Mar, 2023 1 commit
    • [Prim] add meshgrid composite rule (#51061) · 53bb883d
      Committed by chenjian
      * add meshgrid composite rule
      
      * add meshgrid composite rule
      
      * update
      
      * add into CMakeLists
      
      * fix
      
      * update
      
      * update
      
      * optimize code
      
      * fix meshgrid op
      
      * update test