1. 30 Sep, 2022 1 commit
    • support pure bfloat16 for more ops (#46364) · b7b231a6
      sneaxiy committed
      * support pure bfloat16
      
      * support bf16 linear
      
      * update PR to pass CI
      
      * tiny fix where_grad_kernel.cu
      
      * add bfloat16 to selu_grad to pass CI
      
      * fix selu grad compilation error
  2. 28 Sep, 2022 1 commit
    • Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e
      Chen Weihang committed
      * remove needless using tensor
      
      * remove needless using tensor
      
      * resolve conflict
      
      * replace tensor using
      
      * fix format error
      
      * revert needless changing
      
      * fix rocm and npu compile error
      
      * fix cinn compile error
      
      * fix format error
      
      * fix mkldnn format error
      
      * fix mkldnn format error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * resolve conflict
  3. 21 Sep, 2022 1 commit
  4. 18 Sep, 2022 1 commit
  5. 15 Sep, 2022 1 commit
  6. 09 Sep, 2022 2 commits
  7. 08 Sep, 2022 2 commits
  8. 07 Sep, 2022 1 commit
  9. 01 Sep, 2022 1 commit
  10. 31 Aug, 2022 1 commit
  11. 23 Aug, 2022 1 commit
  12. 17 Aug, 2022 1 commit
  13. 16 Aug, 2022 1 commit
    • convert multihead to oss (#45019) · f706d95d
      feng_shuai committed
      * convert multihead to oss
      
      * fix:bug
      
      * fix:delete const cast
      
      * fix:don't support bias_qk
      
      * add vit pass
      
      * fix:convert bug and add preln_residual_bias
      
      * support length=-1
      
      * add UT for convert
      
      * add no_bias_qk support for gpu_multihead_op
      
      * delete infer_shape depends on bias_qk
      
      * oss just can be used in T4 and A*
      
      * fix:change api for ROCM CI
  14. 15 Aug, 2022 2 commits
  15. 09 Aug, 2022 1 commit
  16. 05 Aug, 2022 1 commit
  17. 02 Aug, 2022 1 commit
  18. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Leo Chen committed
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  19. 29 Jul, 2022 3 commits
  20. 26 Jul, 2022 1 commit
  21. 19 Jul, 2022 2 commits
  22. 18 Jul, 2022 1 commit
  23. 13 Jul, 2022 1 commit
  24. 12 Jul, 2022 1 commit
  25. 08 Jul, 2022 2 commits
  26. 07 Jul, 2022 2 commits
  27. 06 Jul, 2022 2 commits
  28. 02 Jul, 2022 1 commit
    • unify cpu context, part2 (#44012) · 755438a7
      Leo Chen committed
      * fix init()
      
      * delete test_device_context
      
      * replace CPUDeviceContext with CPUContext
      
      * fix test_scalar
      
      * remove dot_op.cc
      
      * fix compile
  29. 01 Jul, 2022 1 commit
    • Addition of switch_auto_tune option for transpose op (#43310) · 53d5abe3
      limingshu committed
      * 2nd part of transpose update
      
      * add switch_auto_tune option.
      
      * add some changes according to Ci
      
      * refine the structure of auto_tune_base.
      
      * merge develop changes
      
      * reset the switch_set_range and change unittest of transpose auto-tune
      
      * change the kernel auto-tune logits
  30. 30 Jun, 2022 2 commits