1. 20 Mar, 2023 17 commits
  2. 18 Mar, 2023 1 commit
  3. 17 Mar, 2023 15 commits
  4. 16 Mar, 2023 7 commits
    • Comp index select (#51215) · d1e2c61b
      Committed by Roc
    • [Custom Operator] Custom op support inplace mechanism (#51620) · f824bc0d
      Committed by HongyuJia
      * init unit test commit, contains register thinking
      
      * support inplace
      
      * get inplaced x.grad
      
      * Try support inplace and hook at the same time
      
      * Support inplace, need debug
      
      * Support inplace successfully
      
      * Inplace use Tensor&, consistent with Tensor*
      
      * fix MapPlainOutputs bug
      
      * fix double grad inplace error
    • rename flash_attn_raw to flash_attn_unpadded (#51704) · 0b778bdc
      Committed by Chitsing KUI
      * rename flash_attn_raw to flash_attn_unpadded
      
      * fix static api
      
      * fix static return
    • Add Deformable Conv Dynamic Shape Support (#50698) · 86bf8274
      Committed by xjmxyt
      * add dynamic support
      
      * add more test
      
      * fix bug
      
      * change test
      
      * change test
    • add fp32 grad plus fp16 param in adamw (#51141) · 290aa368
      Committed by shaojie_wang
      * add fp32 grad plus fp16 param in adamw
      
      * add python UT
      
      * fix test case
      
      * in test_adamw_op py file, force the moment2 value LE 0
      
      * add a compare option
      
      * remove bf16 fused adam kernel case
    • [Auto Parallel Performance] Support BF16 Training (#51285) · 9ded5707
      Committed by JZ-LIANG
      * update env setting
      
      * update pass logic
      
      * dist op support bf16
      
      * backward cast update
      
      * update setting
      
      * update backward
      
      * revert amp pass
      
      * update fp16 backward logic
      
      * register c_embedding bf16
      
      * revert engine
      
      * add unitest
      
      * add unitest
      
      * update unitest
      
      * update cmake
      
      * update math
      
      * update math.py
      
      * update unitest
      
      * update unitest
      
      * revise unitest
      
      * revise unitest
      
      * update unitest
      
      * update unitest
      
      * update unitest
    • split layernorm pass (#51228) · 3f3372b6
      Committed by wenbin
      * split pass
      
      * fix compile
      
      * fix ut
      
      * more time
      
      * modify ut
      
      * reduce dim
      
      * fix compile
      
      * reshape weight
      
      * tensor
      
      * remove enforce
      
      * static shape ut
      
      * batchsize
      
      * reorder pass
      
      * minus test cases
      
      * windows timeout
      
      * windows time out
      
      * remove test for windows
      
      * correct
      
      * sssss
      
      * xxx