1. 18 Aug, 2023 16 commits
  2. 17 Aug, 2023 19 commits
  3. 16 Aug, 2023 5 commits
    • Rename desctensor (#56334) · 6a42ddc6
      Committed by Charles-hit
      * add flags for cinn test
      
      * rename DescTensor
      
      * remove useless code
      
      * modify code style
      
      * modify code style
      
      * modify code style
    • [NewIR]New ir support vector type place transfer (#56328) · 2a378ff5
      Committed by hong
      * fix op translator reshape type
      
      * new ir support vector type place transfer
      
      * add test case
    • Add mp_all_reduce asynchronize overlap. (#55662) · 6b1dfb5f
      Committed by Ghost Screaming
      * [WIP] Add mp_all_reduce asynchronize overlap.
      
      * Fix some problems.
      
      * Fix dw compute bug, and use a temporary solution to achieve overlap.
      
      * Use fused_linear_param_grad_add to compute dw.
      
      * Reformat ColumnParallel _overlap_linear. Use environment flags to
      control following behaviors:
      1. export Flags_mp_aysnc_allreduce=True to turn on mp async all_reduce
      2. export Flags_skip_mp_c_identity=True to skip two c_identity operators
         in dygraph mode.
      3. export Flags_fused_linear_param_grad_add to enable fused_linear_param_grad_add
         in ColumnParallel backward with mp async all_reduce.
      
      * Polish code.
      
      * Remove useless communication API.
      
      * Fix some problems in mp_async_all_reduce and skip_c_identity.
      
      * Add test cases.
      
      * Remove environment variable Flags_fused_linear_param_grad_add in test case.
      
      * Reset error threshold.
      
      * Reset threshold in test case.
      
      * Add useful log. Remove useless test cases.
    • [NewIR] support c_broadcast (#56284) · a8981be0
      Committed by Leo Chen
      * [NewIR] support c_broadcast
      
      * add legacyOpList
    • Refine FusedNorm comment (#56305) · 12547fb4
      Committed by MarDino
      * refine static op return val