1. 22 Aug, 2023 8 commits
  2. 21 Aug, 2023 17 commits
  3. 19 Aug, 2023 1 commit
  4. 18 Aug, 2023 4 commits
  5. 17 Aug, 2023 8 commits
  6. 16 Aug, 2023 2 commits
    • Add mp_all_reduce asynchronize overlap. (#55662) · 6b1dfb5f
      Committed by Ghost Screaming
      * [WIP] Add mp_all_reduce asynchronize overlap.
      
      * Fix some problems.
      
      * Fix dw compute bug, and use a temporary solution to achieve overlap.
      
      * Use fused_linear_param_grad_add to compute dw.
      
      * Reformat ColumnParallel _overlap_linear. Use environment flags to
      control the following behaviors:
      1. export Flags_mp_aysnc_allreduce=True to turn on mp async all_reduce
      2. export Flags_skip_mp_c_identity=True to skip two c_identity operators
         in dygraph mode.
      3. export Flags_fused_linear_param_grad_add to enable fused_linear_param_grad_add
         in ColumnParallel backward with mp async all_reduce.
      
      * Polish code.
      
      * Remove useless communication API.
      
      * Fix some problems in mp_async_all_reduce and skip_c_identity.
      
      * Add test cases.
      
      * Remove environment variable Flags_fused_linear_param_grad_add in test case.
      
      * Reset error threshold.
      
      * Reset threshold in test case.
      
      * Add useful log. Remove useless test cases.
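
The commit message above names three environment flags but does not show how they are read. The following is a minimal sketch, not Paddle's actual code: the flag names are copied verbatim from the message (including the spelling `aysnc`), while the helper names `env_flag` and `read_mp_flags` and the set of accepted truthy strings are assumptions made for illustration.

```python
import os

# Flag names copied verbatim from the commit message (including "aysnc").
# How Paddle itself parses them is not shown in this log; the truthy
# values accepted below are an assumption.
MP_FLAGS = (
    "Flags_mp_aysnc_allreduce",
    "Flags_skip_mp_c_identity",
    "Flags_fused_linear_param_grad_add",
)

_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, environ=None):
    """Hypothetical helper: True if the variable holds a truthy string."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").strip().lower() in _TRUTHY


def read_mp_flags(environ=None):
    """Return a {flag_name: bool} snapshot for the three flags."""
    return {name: env_flag(name, environ) for name in MP_FLAGS}


if __name__ == "__main__":
    # Print the current state of each flag in this process's environment.
    for name, enabled in read_mp_flags().items():
        print(f"{name} = {enabled}")
```

Passing a plain dictionary as `environ` lets the parsing be checked without touching the real process environment. Per the message, the third flag only matters in the ColumnParallel backward pass when mp async all_reduce is also turned on.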
    • Refine FusedNorm comment (#56305) · 12547fb4
      Committed by MarDino
      * refine static op return val