1. 22 May, 2023 · 5 commits
    • [dygraph] unify _non_static_mode(), in_dygraph_mode() and in_dynamic_mode() (#53856) · 3794d171
      Committed by Meteor Liu
      * [dygraph] unify _non_static_mode(), in_dygraph_mode() and in_dynamic_mode()
      
      * fixed cyclic reference that caused a partial import
      
      * fixed bad change
      
      * fix bad import
      
      * fix UT failures caused by the in_dynamic_mode change
      
      * fixed usage of in_dynamic_mode() or in_dygraph_mode()
      
      * revert python3 to python in .pre-commit-config.yaml
      
      * fix merge conflicts
      3794d171
    • Delete the Chinese description in ctest (#54018) · f7083f47
      Committed by niuliling123
      f7083f47
    • fix device changed in setitem-numpy case (#53987) · ae35f502
      Committed by JYChen
      ae35f502
    • Add multiclass_nms3 GPU kernel (#52401) · f71c805e
      Committed by Tian Zheng
      * Add GPU kernel for multiclass_nms3 op
      
      * Make multiclass_nms3 gpu kernel output consistent with cpu kernel
      
      * Fix API incompatibility
      
      * Fix unittests on builds without CUDA
      
      * Fix ROCM build
      
      * Remove fluid headers; Use default atol for unittest
      
      * Change function and variable naming
      
      * Add comments; Reduce redundant code
      
      * Use paddle test framework
      f71c805e
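The kernel above ports multiclass NMS to GPU; the underlying algorithm is greedy IoU suppression. A single-class reference sketch in NumPy (an illustration of the algorithm, not the Paddle kernel, which additionally handles batching, classes, and score thresholds):

```python
import numpy as np

def iou(a, b):
    # Boxes are [x1, y1, x2, y2]; returns intersection-over-union.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-10)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = int(order[0])
        keep.append(i)
        rest = [j for j in order[1:] if iou(boxes[i], boxes[j]) <= iou_threshold]
        order = np.array(rest, dtype=int)
    return keep
```

On GPU the pairwise IoU matrix is computed in parallel and the greedy pass reduced with bitmasks, but the kept set matches this sequential reference, which is why the commit checks GPU output against the CPU kernel.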
    • Print Python traceback when debug mode is CHECK_NAN_INF_AND_ABORT and backward has nan/inf (#52808) · d2fa26f6
      Committed by niuliling123
      d2fa26f6
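The abort-on-nan/inf behaviour can be sketched as a check that raises a Python exception, so the interpreter surfaces the Python traceback pointing at the offending op. This is an illustration of the idea only, not Paddle's actual debugging hook:

```python
import numpy as np

def check_nan_inf(name, arr, abort=True):
    """Raise (so a Python traceback is printed) if `arr` has NaN or Inf values.

    Illustrative only: mirrors the CHECK_NAN_INF_AND_ABORT idea, where the
    backward pass is halted as soon as a non-finite gradient appears.
    """
    bad = ~np.isfinite(arr)
    if bad.any():
        msg = f"{name}: found {int(bad.sum())} nan/inf value(s)"
        if abort:
            raise FloatingPointError(msg)
        print(msg)
```

Raising instead of logging is what makes the Python-level traceback available, which is the point of the commit: the user sees which line of their model produced the bad gradient.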
  2. 19 May, 2023 · 5 commits
    • add minimum grad composite rules (#52561) · 97690816
      Committed by warrentdrew
      * add minimum grad composite rules
      
      * add public python api
      
      * fix format
      
      * update testcase
      
      * fix testcase
      
      * fix format
      
      * fix cmakelist.txt
      
      * fix format
      
      * fix param problem
      
      * fix op and composite rule
      
      * fix bf16 cpu support problem
      
      * fix bf16 cpu issue
      
      * fix axis error log
      
      * add axis for maximum
      
      * revert commit
      
      * remove .orig
      
      * fix generic problem
      
      * revert max op
      
      * fix axis error
      
      * fix maximum axis
      
      * fix test_check_output
      
      * fix cinn
      
      * fix minimum maximum axis check
      97690816
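A composite rule expresses minimum's gradient through primitive ops: the upstream gradient flows to whichever operand was smaller. A NumPy sketch of the rule (sending ties to x is an assumption here; frameworks differ on tie-breaking):

```python
import numpy as np

def minimum_grad(x, y, grad_out):
    """Composite-style gradient for z = minimum(x, y).

    dz/dx is 1 where x <= y (x was selected), 0 elsewhere; dz/dy is the
    complement. Built only from comparison, cast, and multiply primitives.
    """
    mask = (x <= y).astype(grad_out.dtype)
    return grad_out * mask, grad_out * (1.0 - mask)
```

Expressing the rule with primitives like these is what lets a compiler backend (e.g. CINN, mentioned in the bullets above) fuse and optimize the backward pass instead of calling an opaque kernel.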
    • Add flash attention to speed up fused_gate_attention (#52731) · d29c1f8e
      Committed by limingshu
      * Reorganize the forward codes of flash-attention.
      
      * Fix forward.
      
      * Remove some unused code.
      
      * Simplify codes and fix backward.
      
      * Change all LOG(INFO) to VLOG and fix the backward.
      
      * add scale for AF2 flash_attn; many thanks to xreki and shaojie for debugging this code
      
      * decrease the effect of debug print on performance
      
      * Unify the initialize of flashattn arguments.
      
      * Rewrite the reshape of temp_mask and temp_bias.
      
      * API support use_flash_attn.
      
      * Fix compiling error on CI.
      
      * Try to crop the flash-attention lib.
      
      * Correct the condition for whether flash-attn can be used.
      
      * Remove the softmax_out argument.
      
      * Remove is_causal.
      
      * Polish codes.
      
      * Fix qkv_transpose_out's shape and scaling of Q * K.
      
      * Update commit of flash-attention.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
      d29c1f8e
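Flash-attention computes the same result as plain scaled dot-product attention, but tile-by-tile without materializing the full score matrix. A reference (non-fused) NumPy sketch for a single head with an optional additive bias, where the 1/sqrt(d) factor is the Q*K scaling discussed in the bullets above:

```python
import numpy as np

def attention(q, k, v, bias=None):
    """Reference scaled dot-product attention.

    Flash-attention reproduces exactly this output, streaming over tiles of
    `s` so the (seq, seq) score matrix never lives in memory at once.
    """
    scale = 1.0 / np.sqrt(q.shape[-1])       # Q*K scaling
    s = q @ k.T * scale
    if bias is not None:                      # e.g. gate/pair bias in AF2-style models
        s = s + bias
    s = s - s.max(axis=-1, keepdims=True)     # numerically stable softmax
    p = np.exp(s)
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v
```

A fused kernel like the one this commit wires into fused_gate_attention must match this reference bit-for-bit in semantics, which is why several bullets above are about fixing the scale and the reshape of the mask and bias.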
    • Add large dim test of log_softmax (#53954) · 1b6972fd
      Committed by Zhang Zheng
      * Add large dim test of log_softmax
      
      * fix
      1b6972fd
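Large-dimension tests of log_softmax mostly stress numerical stability: naive exp() overflows for large logits. The standard fix is to subtract the row maximum before exponentiating; a NumPy sketch of the stable formulation:

```python
import numpy as np

def log_softmax(x, axis=-1):
    """Numerically stable log-softmax.

    Subtracting the max makes every exponent <= 0, so exp() cannot overflow
    regardless of how large the raw logits are.
    """
    m = x.max(axis=axis, keepdims=True)
    shifted = x - m
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))
```

The result is invariant to the shift, since log-softmax(x) = log-softmax(x + c) for any constant c along the reduced axis.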
    • fix meshgrid and expand_as test (#53951) · 14e0ce71
      Committed by Charles-hit
      14e0ce71
    • delete bf16 of cross entropy (#53922) · 69d3f4e3
      Committed by Danyang Zhang
      * delete bf16 of cross entropy
      69d3f4e3
  3. 18 May, 2023 · 6 commits
  4. 17 May, 2023 · 4 commits
  5. 16 May, 2023 · 14 commits
  6. 15 May, 2023 · 4 commits
  7. 12 May, 2023 · 2 commits
    • 【Hackathon 4 No.20】Add i0 / i0e to paddle (#52058) · ce256f75
      Committed by PommesPeter
      * added base code for i0 and i0e
      
      * added grad base code for i0 and i0e
      
      * added i0 and i0e python code
      
      * added ops and backward yaml config
      
      * added i0 and i0e cpu kernels, but not yet tested.
      
      * added i0 and i0e code and unittest files
      
      * added test files
      
      * added i0/i0e gpu implementation code
      
      * updated code style
      
      * updated code style
      
      * fixed unittest code
      
      * updated i0 with eigen3
      
      * fixed bug and added more test cases
      
      * refactor: fixed static graph bug
      
      * refactor: removed i0 and i0e from op_compat
      
      * refactor: updated code style
      
      * refactor: updated op_compat.yaml
      
      * refactor: updated op_compat.yaml
      
      * refactor: fixed op name mapping and optimize unittest case
      
      * refactor: manually implement i0 / i0e
      
      * refactor: added grad kernel for i0 / i0e, didn't finish
      
      * Update math.py
      
      * refactor: added equation to doc in English and added comments for computing i0 / i0e gradient
      
      * refactor: removed eigen implementation
      
      * refactor: finished i0 / i0e cpu and gpu op
      
      * refactor: updated code style
      
      * fix: found a bug but not yet fixed
      
      * fix: incorrect unittest cases
      
      * update: updated code style and remove my file
      
      * update: updated unittest case
      
      * fix: fixed sign error
      
      * fix: fixed mistakes when merging
      
      * refactor: updated code style
      
      * refactor: remove unused code
      
      * refactor: updated code style
      ce256f75
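For reference, i0 is the modified Bessel function of the first kind of order 0, and i0e its exponentially scaled variant, which stays bounded for large |x|. The power series below is a plain-Python sketch of the math ("manually implement i0 / i0e" in the bullets above), not the Paddle kernel:

```python
import math

def i0(x, terms=50):
    """I0(x) via its power series: sum over k of (x^2/4)^k / (k!)^2."""
    t = (x * x) / 4.0
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= t / (k * k)   # builds (x^2/4)^k / (k!)^2 incrementally
        total += term
    return total

def i0e(x):
    """Exponentially scaled variant: exp(-|x|) * I0(x).

    I0 grows like exp(|x|), so the scaled form avoids overflow for large |x|.
    """
    return math.exp(-abs(x)) * i0(x)
```

The raw series is only practical for moderate |x|; production kernels switch to asymptotic or polynomial approximations for large arguments, which is one reason the gradient commits above took several iterations.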
    • 【Prim】support higher order autodiff for dy2static+composite (#53171) · b73594b4
      Committed by Xiaoxu Chen
      * [Dy2St]Fix x grad names when high order gradient
      
      * Polish error msg
      
      * Add inputs var to backward in dy2st
      
      * Fix error
      
      * Get grad names for backward API
      
      * Fix save load
      
      * Polish code
      
      * Add ut
      
      * [prim] fix not support optional grad bugs in higher order autodiff
      
      * [prim] remove duplicate fill_any_like caused by infershape_for_composite
      
      * fix _strip_grad_suffix_ bugs in higher-order autodiff
      
      * [prim] create output for test_static_prim.cc
      
      ---------
      Co-authored-by: 0x45f <wangzhen45@baidu.com>
      b73594b4
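Higher-order autodiff means differentiating the result of differentiation, so the backward graph must itself be differentiable. As a crude numeric analogue of that nesting (illustrative only, unrelated to Paddle's implementation), composing central differences yields a second-derivative check:

```python
def grad(f, h=1e-5):
    """Numeric first derivative via central differences.

    Nesting grad(grad(f)) mimics higher-order differentiation: the output of
    one differentiation pass is fed into another.
    """
    return lambda x: (f(x + h) - f(x - h)) / (2.0 * h)

f = lambda x: x ** 3
df = grad(f)             # approximates f'(x)  = 3x^2
d2f = grad(df, h=1e-4)   # approximates f''(x) = 6x; larger step tames noise
```

An autodiff system does this exactly rather than numerically, but the structural requirement is the same: the first-order gradient program must be a valid input to the differentiator again, which is what the bullets above (grad names, optional grads, the _strip_grad_suffix_ fix) are making work.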