1. 18 Aug, 2023 12 commits
  2. 17 Aug, 2023 19 commits
  3. 16 Aug, 2023 9 commits
    • Rename desctensor (#56334) · 6a42ddc6
      Charles-hit committed
      * add flags for cinn test
      
      * rename DescTensor
      
      * remove useless code
      
      * modify code style
      
      * modify code style
      
      * modify code style
    • [NewIR]New ir support vector type place transfer (#56328) · 2a378ff5
      hong committed
      * fix op translator reshape type
      
      * new ir support vector type place transfer
      
      * add test case
    • Add mp_all_reduce asynchronize overlap. (#55662) · 6b1dfb5f
      Ghost Screaming committed
      * [WIP] Add mp_all_reduce asynchronize overlap.
      
      * Fix some problems.
      
      * Fix dw compute bug, and use a temporary solution to achieve overlap.
      
      * Use fused_linear_param_grad_add to compute dw.
      
      * Reformat ColumnParallel _overlap_linear. Use environment flags to
      control the following behaviors:
      1. export Flags_mp_aysnc_allreduce=True to turn on mp async all_reduce
      2. export Flags_skip_mp_c_identity=True to skip two c_identity operators
         in dygraph mode.
      3. export Flags_fused_linear_param_grad_add to enable fused_linear_param_grad_add
         in ColumnParallel backward with mp async all_reduce.
      
      * Polish code.
      
      * Remove useless communication API.
      
      * Fix some problems in mp_async_all_reduce and skip_c_identity.
      
      * Add test cases.
      
      * Remove environment variable Flags_fused_linear_param_grad_add in test case.
      
      * Reset error threshold.
      
      * Reset threshold in test case.
      
      * Add useful log. Remove useless test cases.
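The three environment flags listed in this commit message can be read with a small helper like the one below. This is a minimal sketch, not the code from the commit: the helper names `env_flag_enabled` and `read_mp_flags` are hypothetical, and treating `"true"` and `"1"` (case-insensitive) as enabled is an assumption, since the message only shows `=True` for two of the flags and gives no value for the third. The flag names are copied verbatim from the message.

```python
import os

# Flag names exactly as spelled in the commit message.
MP_FLAGS = (
    "Flags_mp_aysnc_allreduce",
    "Flags_skip_mp_c_identity",
    "Flags_fused_linear_param_grad_add",
)


def env_flag_enabled(name, environ=None):
    """Return True when the variable holds a truthy string.

    Hypothetical helper: the accepted values ("true", "1") are an
    assumption, not taken from the commit.
    """
    environ = os.environ if environ is None else environ
    return str(environ.get(name, "")).strip().lower() in ("true", "1")


def read_mp_flags(environ=None):
    """Collect the on/off state of all three flags into a dict."""
    return {name: env_flag_enabled(name, environ) for name in MP_FLAGS}
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to exercise in a test without mutating the process environment.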
    • [NewIR] support c_broadcast (#56284) · a8981be0
      Leo Chen committed
      * [NewIR] support c_broadcast
      
      * add legacyOpList
    • Refine FusedNorm comment (#56305) · 12547fb4
      MarDino committed
      * refine static op return val
    • move test case from cpp to python (#56333) · 163152aa
      LiYuRio committed
    • Fix Mac Timeout Test (#56259) · 87bc6f2c
      tianshuo78520a committed
      * Fix Mac Timeout Test
      
      * fix test
      
      * fix
      
      * Fix test
      
      * Test
      
      * Fix mac test
    • [NewIR] Insert `get_parameter` only for parameters (#56325) · 0f611f18
      kangguangli committed
      * fix insert get_parameter op bug
      
      * fix bug: insert only for parameters
      
      * fix bug: wrong idx in vector
      
      ---------
      Co-authored-by: Nzhangbo9674 <zhangbo54@baidu.com>
    • [CINN] Add ScheduleBlock graph (#56122) · da72707f
      BiynXu committed
      Added a graph data structure in units of ScheduleBlock and some necessary operations, such as finding upstream and downstream nodes, and performing operations in the DFS topological order.
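The operations this commit describes (finding upstream and downstream nodes, visiting nodes in DFS topological order) can be illustrated with a small stand-alone graph. This is only a sketch of the general technique, not CINN's implementation: the class `BlockGraph` and its method names are hypothetical, nodes are plain strings standing in for ScheduleBlocks, and "upstream" and "downstream" are assumed to mean all transitively reachable producers and consumers.

```python
class BlockGraph:
    """Toy dependency graph; each node stands in for one ScheduleBlock."""

    def __init__(self):
        self._succ = {}  # node -> list of direct consumers
        self._pred = {}  # node -> list of direct producers

    def add_node(self, name):
        self._succ.setdefault(name, [])
        self._pred.setdefault(name, [])

    def add_edge(self, producer, consumer):
        self.add_node(producer)
        self.add_node(consumer)
        self._succ[producer].append(consumer)
        self._pred[consumer].append(producer)

    def _reachable(self, start, adjacency):
        # Iterative DFS collecting every node reachable from `start`.
        seen, stack = set(), list(adjacency[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adjacency[node])
        return seen

    def upstream_nodes(self, name):
        return self._reachable(name, self._pred)

    def downstream_nodes(self, name):
        return self._reachable(name, self._succ)

    def dfs_topo_order(self):
        # Visit all producers of a node before appending the node itself,
        # so every producer precedes its consumers (assumes a DAG).
        order, done = [], set()

        def visit(node):
            if node in done:
                return
            done.add(node)
            for producer in self._pred[node]:
                visit(producer)
            order.append(node)

        for node in self._succ:
            visit(node)
        return order
```

A diamond-shaped graph (`A` feeding `B` and `C`, both feeding `D`) is a convenient check: `D` has three upstream nodes, `A` has three downstream nodes, and any valid order places `A` first and `D` last.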