1. 05 Aug, 2021 12 commits
  2. 04 Aug, 2021 8 commits
  3. 03 Aug, 2021 7 commits
  4. 02 Aug, 2021 6 commits
  5. 31 Jul, 2021 1 commit
  6. 30 Jul, 2021 6 commits
    • Revert of PR34452 (#34516) · 72a9c8ff
      Committed by Huihuang Zheng
    • Added matmul_v2 BF16/FP32 BWD kernel (#34192) · 0be71571
      Committed by jakpiase
      * test version of matmul_v2
      
      * added matmul_v2 grad kernel
      
      * minor changes
      
      * minor changes
      
      * minor change for CI approval
      
      * CI fix
      
      * CI fix
      
      * trigger CI
      
      * changes after review, not working yet
      
      * moved ops to anonymous namespaces
      
      * changes after review
    • [NPU] disable EmbeddingDenseGrad temporarily (#34498) · 2ad1e4c7
      Committed by Leo Chen
    • Added reshape, reshape2, squeeze and squeeze2 BF16/FP32 FWD/BWD kernels (#34219) · 22c4c189
      Committed by jakpiase
      * test version of matmul_v2
      
      * added matmul_v2 grad kernel
      
      * minor changes
      
      * minor changes
      
      * minor change for CI approval
      
      * CI fix
      
      * CI fix
      
      * added squeeze and squeeze2 kernels
      
      * CI fix
      
      * CI fix
      
      * CI fix
      
      * disabled tests when compiled with cuda
      
      * added setting format_tag by strides
      
      * added sigmoid BF16 FWD/BWD and gelu BF16 BWD
      
      * changes after review
      
      * Revert "added sigmoid BF16 FWD/BWD and gelu BF16 BWD"
      
      This reverts commit 6e3f76720b545abfcff9f6052b46b73a1e745cae.
      
      * Revert "Merge branch 'matmul_v2_grad' into squeeze2_op"
      
      This reverts commit 06fcf67843a4a7884eccdf67a02a03575e1d4cb8, reversing
      changes made to 6e3f76720b545abfcff9f6052b46b73a1e745cae.
      
      * minor change
      
      * added reshape1/2 kernels
      
      * moved some functions into private block
      
      * CI fix
      
      * CI fix
      
      * CI fix
    • add trainer desc config to distributed strategy (#34457) · e6aacd1e
      Committed by wangguanqun
      * add trainer desc config to distributed strategy
      
      * code style modified
    • Added expand_v2 BF16/FP32 FWD/BWD kernels (#34284) · 41c4f723
      Committed by jakpiase
      * added expand_v2 bf16/fp32 kernel
      
      * minor change
      
      * CI fix
      
      * added missing test file
      
      * added formatting
      
      * reduced binary size
      
      * CI fix