1. 10 Aug, 2023 1 commit
    • Add variable_length_memory_efficient_attention (#55400) · 4036c937
      lzy committed
      * add variable_length_memory_efficient_attention
      * update variable_length_memory_efficient_attention unittest
      * update variable_length_mem_eff_attn's docs and unittest
      * update variable_length_mem_eff_attn's docs
      * Update test_variable_length_memory_efficient_attention.py
      * Update variable_length_memory_efficient_attention.cu
      * fix codestyle
      * fix variable_length_fmha's docs and unittest
      * fix variable_length_fmha's docs
  2. 07 Aug, 2023 1 commit
    • Add attn_mask support for FlashAttnKernel. (#55969) · 42e0c6b8
      yin wei committed
      * add mask
      
      * add backward
      
      * add enforce info
      
      * update scale
      
      * integrate code
      
      * update enforce
      
      * add enforce eq
      
      * add error type
      
      * update enforce
      
      * add test_flash_attention
      
      * Polish codes and fix compiling errors.
      
      * Set num_splits to 0 for flash-attn with tensor mask.
      
      * Fix the compiling error for non flash-attn case.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
  3. 04 Aug, 2023 1 commit
    • [NewIR] Rename feed with place to data (#55778) · 274e5e54
      kangguangli committed
      * fix bug: feed_with_place should consider variable existence
      
      * fix
      
      * fix build scope
      
      * change method to set feed var name
      
      * remove feed_with_place to placeholder
      
      * fix
      
      * rename to data
      
      * fix
      
      * fix
  4. 31 Jul, 2023 2 commits
  5. 27 Jul, 2023 1 commit
  6. 25 Jul, 2023 1 commit
    • [NewIR] new ir dygraph to static support gpu (#55620) · fb9bec5d
      hong committed
      * add kernel dialect
      
      * change DenseTensorTypeStorage to DenseTensorType
      
      * add test case
      
      * add first pd_op to kernel dialect
      
      * lower pd op to kernel dialect
      
      * update
      
      * update
      
      * remove useless code
      
      * add attribute print test
      
      * fix bug
      
      * update
      
      * update
      
      * update
      
      * update
      
      * polish code
      
      * fix bug
      
      * polish code and add python test
      
      * add test
      
      * fix test error
      
      * relax constraint when inserting get_parameter
      
      * add env flag
      
      * fix bug
      
      * dygraph2static support new ir
      
      * fix bug
      
      * revert test env
      
      * change cc_test_old to cc_test
      
      * update
      
      * fix build_static bug
      
      * update test
      
      * fix type test error
      
      * update cmake
      
      * disable test in windows
      
      * fix inference compile
      
      * fix program translator error
      
      * only run on cpu, not support gpu yet
      
      * fix conflict
      
      * polish code
      
      * fix bug
      
      * add feed with place op
      
      * update
      
      * remove useless unittest

      * update mkldnn
      
      * update
      
      * update
      
      * align mkldnn version
      
      * new ir support builtin slice op
      
      * fix bug
      
      * fix phi kernel adaptor bug
      
      * add enable static
      
      * add enable_static
      
      * remove useless test case
      
      * change feed list to single variable
      
      * update
      
      * add feed with place and shadow output op

      * fix bug

      * remove useless code
      
      * support gpu
      
      * fix bug
      
      * fix bug
      
      * remove template
      
      * add more data type
      
      * fix compile bug

      * update
      
      * remove useless code
      
      * revert dygraph2st test
      
      * remove useless code
      
      * revert op
      
      * fix bug
      
      * new ir dygraph2static support gpu
      
      * remove useless code
      
      * code polish
      
      * add const
      
      * revert code and remove useless code
      
      * revert code
      
      * revert legacy op yaml
      
      * remove useless code
      
      * delete std::move
      
      ---------
      Co-authored-by: kangguangli <kangguangli@hotmail.com>
  7. 20 Jul, 2023 1 commit
  8. 19 Jul, 2023 1 commit
    • delete relu6_raw (#55383) · 56d46ccc
      zhangyuqin1998 committed
      * delete relu6_raw
      
      * fix codestyle
      
      * Update test_mkldnn_matmul_activation_fuse_pass.py
      
      * fix
      
      * Update backward.yaml
      
      * Update ops.yaml
      
      * Update backward.yaml
  9. 18 Jul, 2023 1 commit
    • batch add inplace api (#55078) · 19302938
      GGBond8488 committed
      * batch add inplace api
      
      * fix inplace fn generate
      
      * add test for new inplace api

      * fix typo

      * fix typo

      * fix typo
      
      * fix test error
      
      * fix atan2
      
      * remove atan2
      
      * auto generate inplace api
      
      * fix inplace generate fn error
      
      * fix windows error
      
      * fix test error
      
      * fix test error
      
      * fix windows ci error
      
      * fix test error
      
      * fix test_error
      
      * fix test error
      
      * fix eigen aliasing error in inplace
      
      * remove elementwise_pow inplace
      
      * fix doc error
      
      * fix test error
  10. 14 Jul, 2023 1 commit
  11. 11 Jul, 2023 2 commits
    • Integrate rmsnorm kernel (#54998) · 97d3d6ee
      MarDino committed
      * add rmsnorm kernel
      * add static graph test
      * fix round type
      * use alignas to avoid msvc compile error
      * remove redundant headerfile to avoid rocm compile error
      * fix rocm compile not found cub
      * Add document
    • Linear compress (#55128) · f4290a92
      FormlessUnit committed
      * rename weight_only/llm.int8
  12. 10 Jul, 2023 1 commit
  13. 04 Jul, 2023 1 commit
  14. 03 Jul, 2023 2 commits
  15. 30 Jun, 2023 3 commits
  16. 28 Jun, 2023 2 commits
  17. 26 Jun, 2023 2 commits
  18. 20 Jun, 2023 1 commit
    • [IR] Change IR from Static library to dynamic library (#54729) · 24a3cb52
      zhangbo9674 committed
      * new_ir to shared
      
      * refine code
      
      * add ir lib path to env
      
      * refine type
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * refine code
      
      * refine code
      
      * close win
      
      * refine code
      
      * refine code
      
      * refine code
      
      * add win share
      
      * refine code
      
      * refine code
      
      * refine code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * solve conflict
      
      * solve conflict
      
      * fix bug
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * refine code
      
      * fix interpretercore program bug
      
      * delete unused code
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix cinn bug
      
      * fix cinn bug
      
      * debug
      
      * fix cinn bug
      
      * delete unused code
      
      * fix cinn bug
      
      * fix cinn bug
      
      * fix bug
      
      * test win openblas
      
      * test win openblas
      
      * fix win openblas bug
      
      * polish code
      
      * fix win open blas bug
      
      * close win dll
      
      * fix flag bug
      
      * test for windows
      
      * fix compile bug
  19. 16 Jun, 2023 1 commit
  20. 15 Jun, 2023 1 commit
  21. 14 Jun, 2023 1 commit
  22. 13 Jun, 2023 1 commit
  23. 09 Jun, 2023 1 commit
  24. 08 Jun, 2023 1 commit
    • [AMP] Add check_numerics API. (#54301) · a5444592
      Yiqun Liu committed
      * Add outputs to check_numerics_kernel.
      
      * Add check_numerics to yaml.
      
      * Add API and unittest.
      
      * Add check_nan_inf_level as argument of check_numerics_kernel.
      
      * Add more unittests.
      
      * Fix static API implementation and unittest.
      
      * Move the implementation of check_numerics to paddle.amp.
      
      * Fix import error.
  25. 05 Jun, 2023 3 commits
  26. 02 Jun, 2023 1 commit
  27. 01 Jun, 2023 2 commits
  28. 23 May, 2023 2 commits
  29. 18 May, 2023 1 commit
    • support auto generate for op layer_norm (#53178) · 4f07b653
      RedContritio committed
      * simplify layer_norm_op.cc
      
      * support auto generate for op layer_norm
      
      * update unittest for composite_layer_norm
      
      * remove layer_norm_op.cc from scripts
      
      * replace layer_norm_op with generated_op
      
      * add get_expected_kernel for layer_norm
      
      * update cmake kernel register function for layer_norm_mkldnn_op