1. 05 Jun, 2023 2 commits
    • [bug fix] group norm backward (#54341) · d338b2f8
      wangzhen38 committed
    • optimize logsumexp in small data scale (#52952) · 93e1bb98
      Asthestarsfalll committed
      * optimize logsumexp in small data scale
      
      * fix
      
      * fix
      
      * add #pragma once
      
      * switch to use aligned_vector and support arbitrary shapes
      
      * fix store
      
      * fix store
      
      * refine for special cases
      
      * try
      
      * fix
      
      * update
      
      * fix
      
      * fix all_reduce
      
      * try
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
      
      * fix rocm bug
  2. 02 Jun, 2023 2 commits
  3. 30 May, 2023 1 commit
    • [AMP] Reimplement check_nan_inf as check_numerics_kernel. (#52245) · 44bd5927
      Yiqun Liu committed
      * Reimplement the check_nan_inf function as check_numerics kernel.
      
      * Move the cpu implementation to phi.
      
      * Add ifdef for the including of omp.h.
      
      * Move the use of FLAGS_check_nan_inf_level out of header file.
      
      * Implement a common PrintAndThrowError function.
      
      * Fix the erroneous use of __NVCC__, which should be replaced with __CUDA_ARCH__.
      
      * Add dependency of phi.
      
      * Polish codes and unittest.
  4. 26 May, 2023 1 commit
    • [PHI Decoupling]Create PHI shared lib (#53735) · da50a009
      YuanRisheng committed
      * create phi so
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * add file
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * perfect so
      
      * fix py3 bugs
      
      * delete all static target in phi
      
      * fix windows bugs
      
      * fix py3 bugs
      
      * fix ci bugs
      
      * fix windows bugs
      
      * fix bug: gflags can't be linked by both the dynamic and static libs
      
      * fix bug where 3rd-party libs cannot be loaded
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix conflict
      
      * fix xpu bugs
      
      * fix mac compile bugs
      
      * fix psgpu bugs
      
      * fix inference failed
      
      * deal with conflict
      
      * fix LIBRARY_PATH bug
      
      * fix windows bugs
      
      * fix onednn error
      
      * fix windows compile bugs
      
      * fix windows compile bugs
      
      * fix test_cuda_graph_static_mode_error aborted
      
      * fix windows bugs
      
      * fix mac-python3 error
      
      * fix hip compile bugs
      
      * change mode to static
      
      * change to static mode
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix bugs
      
      * add static flag
      
      * add PADDLE_API
      
      * change position of PADDLE_API
      
      * fix windows bugs
      
      * change mode to dynamic lib
      
      * fix windows static bugs
      
      * deal with conflict
      
      * fix windows unit bug
      
      * fix coverage
      
      * deal with conflict
      
      * fix windows-inference
      
      * fix py3 bugs
      
      * fix bugs when compile type_info
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix windows openblas
      
      * fix xpu bugs
      
      * fix enforce_test in windows
      
      * update code according to review comments
      
      * fix windows cmake bug
      
      * fix windows bugs
      
      * fix windows bugs
      
      * delete cinn unittest
      
      * fix cinn bugs
      
      ---------
      Co-authored-by: lzydev <1528794076@qq.com>
  5. 25 May, 2023 1 commit
  6. 24 May, 2023 1 commit
  7. 23 May, 2023 2 commits
  8. 22 May, 2023 1 commit
    • Add multiclass_nms3 GPU kernel (#52401) · f71c805e
      Tian Zheng committed
      * Add GPU kernel for multiclass_nms3 op
      
      * Make multiclass_nms3 gpu kernel output consistent with cpu kernel
      
      * Fix API incompatibility
      
      * Fix unittests on builds without CUDA
      
      * Fix ROCM build
      
      * Remove fluid headers; Use default atol for unittest
      
      * Change function and variable naming
      
      * Add comments; Reduce redundant code
      
      * Use paddle test framework
  9. 19 May, 2023 3 commits
  10. 18 May, 2023 3 commits
  11. 17 May, 2023 1 commit
  12. 16 May, 2023 5 commits
  13. 15 May, 2023 3 commits
  14. 12 May, 2023 6 commits
    • 【Hackathon 4 No.20】Add i0 / i0e to paddle (#52058) · ce256f75
      PommesPeter committed
      * added base code for i0 and i0e
      
      * added grad base code for i0 and i0e
      
      * added i0 and i0e python code
      
      * added ops and backward yaml config
      
      * added i0 and i0e cpu kernel, but not tested.
      
      * added i0 and i0e code and unittest files
      
      * added test files
      
      * added i0/i0e gpu implementation code
      
      * updated code style
      
      * updated code style
      
      * fixed unittest code
      
      * updated i0 with eigen3
      
      * fixed bug and added more test cases
      
      * refactor: fixed static graph bug
      
      * refactor: removed i0 and i0e from op_compat
      
      * refactor: updated code style
      
      * refactor: updated op_compat.yaml
      
      * refactor: updated op_compat.yaml
      
      * refactor: fixed op name mapping and optimized unittest case
      
      * refactor: manually implement i0 / i0e
      
      * refactor: added grad kernel for i0 / i0e, not finished
      
      * Update math.py
      
      * refactor: added equation to doc in English and added comments for computing i0 / i0e gradient
      
      * refactor: removed eigen implementation
      
      * refactor: finished i0 / i0e cpu and gpu op
      
      * refactor: updated code style
      
      * fix: found a bug but have not fixed it
      
      * fix: incorrect unittest cases
      
      * update: updated code style and removed my file
      
      * update: updated unittest case
      
      * fix: fixed sign error
      
      * fix: fixed mistakes when merging
      
      * refactor: updated code style
      
      * refactor: remove unused code
      
      * refactor: updated code style
    • fix add_n kernel of large shape (#53749) · 4d39cc7f
      Leo Chen committed
    • 【prim】add forward output for Silu grad signature (#53632) · 3846111d
      xiaoguoguo626807 committed
      * add rules
      
      * modify silu_grad input
      
      * modify kernel signature
      
      * modify kernel signature
      
      * code style
      
      * review
    • sequence_mask functionalization (#53478) · d2b1e3c2
      Wang Xin committed
      * sequence_mask functionalization
      
      * fix sequence_mask test
    • move pow2_decay_with_linear_warmup kernel to phi (#53741) · 348565b0
      huangjiyi committed
      * update
      
      * update
    • fix er error msg of index_put Op (#53717) · 92db839f
      傅剑寒 committed
  15. 10 May, 2023 4 commits
  16. 09 May, 2023 1 commit
  17. 08 May, 2023 2 commits
  18. 06 May, 2023 1 commit