  1. 14 Aug, 2023 1 commit
    • Add rmsnorm residual bias add and quant (#55965) · 2ac6a7e4
      Committed by MarDino
      * add rmsnorm residual bias add and quant
      
      * refine python interface
      
      * add rmsnorm unittest
      
      * Add layernorm
      
      * fix layernorm unittest
      
      * refine unittest
      
      * fix example code
      
      * fix review comment
  2. 10 Aug, 2023 1 commit
    • Add variable_length_memory_efficient_attention (#55400) · 4036c937
      Committed by lzy
      * add variable_length_memory_efficient_attention
      * update variable_length_memory_efficient_attention unittest
      * update variable_length_mem_eff_attn's docs and unittest
      * update variable_length_mem_eff_attn's docs
      * Update test_variable_length_memory_efficient_attention.py
      * Update variable_length_memory_efficient_attention.cu
      * fix codestyle
      * fix variable_length_fmha's docs and unittest
      * fix variable_length_fmha's docs
  3. 31 Jul, 2023 1 commit
  4. 30 Jul, 2023 1 commit
  5. 13 Jul, 2023 2 commits
  6. 11 Jul, 2023 1 commit
    • Integrate rmsnorm kernel (#54998) · 97d3d6ee
      Committed by MarDino
      * add rmsnorm kernel
      * add static graph test
      * fix round type
      * use alignas to avoid msvc compile error
      * remove redundant headerfile to avoid rocm compile error
      * fix rocm compile not found cub
      * Add document
  7. 10 Jul, 2023 1 commit
  8. 06 Jul, 2023 1 commit
  9. 04 Jul, 2023 2 commits
  10. 03 Jul, 2023 1 commit
  11. 29 Jun, 2023 1 commit
  12. 28 Jun, 2023 1 commit
  13. 27 Jun, 2023 1 commit
  14. 25 Jun, 2023 1 commit
  15. 20 Jun, 2023 1 commit
  16. 15 Jun, 2023 1 commit
  17. 14 Jun, 2023 3 commits
    • Fix cuda12 timeout problems. (#54615) · a90d9088
      Committed by Ghost Screaming
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * Remove climits.
      
      * Fix problem of pickle and NCCL_P2P_DISABLE in distributed testcases in
      cuda12.
      
      * Fix problem of TimeOut of distributed testcases under cuda12.
      
      * Remove useless modification.
      
      * Remove useless modification.
    • [prim] move batch_norm prim test to op_test (#54458) · 58b4c60f
      Committed by cyber-pioneer
      * move batch_norm prim test to op_test
      
      * fix optest bug
      
      * add test to cmake
      
      * add cinn test case
      
      * fix batch_norm prim grad bf16
      
      * fix code
      
      * add cuda check
      
      * fix batch_norm bfloat16
      
      * fix cpu bfloat16 bug
      
      * skip non-bfloat16-supported platform
      
      * fix code
      
      * fix cinn rtol and atol in bfloat16
      
      * fix name
      
      * fix config
    • f7eb03c6
  18. 13 Jun, 2023 3 commits
  19. 12 Jun, 2023 1 commit
  20. 08 Jun, 2023 2 commits
  21. 07 Jun, 2023 1 commit
  22. 05 Jun, 2023 1 commit
  23. 02 Jun, 2023 1 commit
  24. 01 Jun, 2023 2 commits
    • [AMP Prim OP]support bf16 dtype for layer_norm prim op (#54236) · e3fcbb8f
      Committed by Charles-hit
      * support layer_norm prim op bf16 dtype
      
      * polish code
      
      * resolve conflict
    • mv all unittests test (#53235) · b0e86d55
      Committed by tianshuo78520a
      * mv all unittests test
      
      * fix error
      
      * fix error
      
      * fix
      
      * fix
      
      * del unittests
      
      * fix paddle_build.sh
      
      * fix
      
      * fix test
      
      * fix add test
      
      * fix
      
      * fix
      
      * fix
      
      * merge develop
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * merge develop
      
      * fix test_async_read_write
      
      * fix test_async_read_write
      
      * merge develop
      
      * fix
      
      * fix import legacy_test
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix bug
      
      * fix
      
      * fix coverage test bug
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix code style
      
      * fix code
      
      * fix code
      
      * fix
      
      * fix
      
      * fix
      
      * del test_sequence_enumerate_op.py
      
      * fix
  25. 22 May, 2023 1 commit
  26. 18 May, 2023 1 commit
  27. 23 Mar, 2023 1 commit
  28. 22 Mar, 2023 1 commit
  29. 20 Mar, 2023 1 commit
  30. 23 Dec, 2022 1 commit
  31. 14 Jun, 2022 1 commit
  32. 04 Jun, 2022 1 commit