1. 16 Aug, 2023 2 commits
  2. 15 Aug, 2023 1 commit
  3. 14 Aug, 2023 1 commit
    • Add rmsnorm residual bias add and quant (#55965) · 2ac6a7e4
      MarDino committed
      * add rmsnorm residual bias add and quant
      
      * refine python interface
      
      * add rmsnorm unittest
      
      * Add layernorm
      
      * fix layernorm unittest
      
      * refine unittest
      
      * fix example code
      
      * fix review comment
  4. 10 Aug, 2023 1 commit
    • Add variable_length_memory_efficient_attention (#55400) · 4036c937
      lzy committed
      * add variable_length_memory_efficient_attention
      * update variable_length_memory_efficient_attention unittest
      * update variable_length_mem_eff_attn's docs and unittest
      * update variable_length_mem_eff_attn's docs
      * Update test_variable_length_memory_efficient_attention.py
      * Update variable_length_memory_efficient_attention.cu
      * fix codestyle
      * fix variable_length_fmha's docs and unittest
      * fix variable_length_fmha's docs
  5. 09 Aug, 2023 1 commit
  6. 08 Aug, 2023 2 commits
  7. 03 Aug, 2023 3 commits
  8. 02 Aug, 2023 3 commits
  9. 01 Aug, 2023 1 commit
  10. 31 Jul, 2023 1 commit
  11. 28 Jul, 2023 1 commit
  12. 26 Jul, 2023 1 commit
  13. 24 Jul, 2023 1 commit
  14. 20 Jul, 2023 1 commit
  15. 19 Jul, 2023 2 commits
  16. 13 Jul, 2023 4 commits
  17. 12 Jul, 2023 2 commits
    • Support selected rows new ir (#54987) · fc66b5d7
      hong committed
      * refine program translator
      
      * fix warning: not override
      
      * fix bug
      
      * merge new modifications
      
      * modify by reviews
      
      * resolve conflicts
      
      * resolve conflicts
      
      * fix
      
      * fix
      
      * update
      
      * support selected rows
      
      * update
      
      * add selectrows
      
      * fix bug
      
      * add ut
      
      * refine code
      
      * refine code
      
      * update
      
      * update
      
      * support selected rows
      
      * support selected rows
      
      * support dense tensor
      
      * remove useless code
      
      * polish code
      
      * remote standalone executor test
      
      ---------
      Co-authored-by: kangguangli <kangguangli@hotmail.com>
      Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
    • [clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7
      Wang Xin committed
      * [clang-tidy] enable readability-container-size-empty check
      
      * fix test_custom_kernel Failed
      
      * add clang-tidy-10 in dockerfile
      
      * add clang-tidy in dockerfile
      
      * fix bug
  18. 11 Jul, 2023 3 commits
    • support sharding parallel (#54634) · b7a05057
      pangengzheng committed
      * support sharding parallel
      
      * fix name
      
      * fix
      
      * update
      
      * test amp for sharding
      
      ---------
      
      Co-authored-by: pangengzheng <pangengzheng.baidu.com>
    • Integrate rmsnorm kernel (#54998) · 97d3d6ee
      MarDino committed
      * add rmsnorm kernel
      * add static graph test
      * fix round type
      * use alignas to avoid msvc compile error
      * remove redundant headerfile to avoid rocm compile error
      * fix rocm compile not found cub
      * Add document
    • Linear compress (#55128) · f4290a92
      FormlessUnit committed
      * rename weight_only/llm.int8
  19. 07 Jul, 2023 1 commit
  20. 05 Jul, 2023 2 commits
  21. 03 Jul, 2023 2 commits
  22. 30 Jun, 2023 1 commit
  23. 29 Jun, 2023 1 commit
    • Add fused_rope forward op (#54351) · a215c46a
      niuliling123 committed
      * style
      
      * more
      
      * update ctest
      
      * Update legacy_backward.yaml
      
      * Update legacy_ops.yaml
      
      * Update legacy_ops.yaml
      
      * update
      
      * update
      
      * update for move
  24. 28 Jun, 2023 1 commit
  25. 27 Jun, 2023 1 commit