1. 19 Feb, 2022 2 commits
    • [Eager Hook] Support ReduceHook in GradNodeAccumulation (#39674) · 06b177c0
      Weilong Wu authored
      * [Eager] Support GradientHook before running separate GradNode
      
      * Fix CI issue
      
      * Support eager ReduceHook in accumulation_node
      
      * Fix CI issue
      
      * Add some tests to fix coverage CI issue
    • Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61
      sneaxiy authored
      * add DistributedFusedLamb op
      
      * polish code
      
      * fix compile error
      
      * make compatible with pten changes
      
      * fix rocm compile error
      
      * improve coverage
      
      * update upstream/develop
      
      * fix cast_with_ptr.h
      
      * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1
      
      * fix clip before allreduce
      
      * add use_master_param_norm
      
      * code polish
      
      * fix bug
      
      * fix ROCM ci
  2. 18 Feb, 2022 25 commits
  3. 17 Feb, 2022 13 commits