1. 26 May, 2021 1 commit
  2. 25 May, 2021 1 commit
  3. 24 May, 2021 1 commit
  4. 11 May, 2021 1 commit
  5. 07 May, 2021 1 commit
  6. 27 April, 2021 1 commit
  7. 25 April, 2021 3 commits
  8. 22 April, 2021 2 commits
  9. 21 April, 2021 1 commit
  10. 20 April, 2021 1 commit
  11. 19 April, 2021 1 commit
  12. 17 April, 2021 1 commit
  13. 07 April, 2021 1 commit
    • 【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3
      Committed by zhang wenhui
      * Ascend rc (#30483)
      
      * Fix compilation on CANN20.1 and older (#30494)

      Fix compilation on CANN20.1 and older
      
      * Add distribution supported (#30578)
      
      Add distribution supported
      
      * Build parser for Hcom* operators (#30627)

      Build parser for Hcom* operators
      
      * Pass device_ids info from launch to trainer. (#30632)
      
      Pass device_ids info from launch to trainer
      
      * Add Hccl program group (#30642)
      
      Add Hccl program group
      
      * Add startup bash files of test_ascend_group. (#30645)
      
      Add startup bash files of test_ascend_group
      
      * cleanup (#30646)
      
      cleanup test_ascend_group.py
      
      * [Feature] Build parser to support distributed training (#30658)
      
      [Feature] Build parser to support distributed training
      
      * fix compilation on ascend-20.1 (#30722)
      
      fix compilation on ascend-20.1
      
      * Dev/fix ascend string (#30749)
      
      Dev/fix ascend string
      
      * code style (#30781)
      
      code style
      
      * Merge ascend_optimizer and ascend_parser. (#30776)
      
      Merge ascend_optimizer and ascend_parser.
      
      * Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)
      
      Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug
      
      * Add paddle ascend distribution training supported (#30796)
      
      Add paddle ascend distribution training supported
      
      * pass cxx_flags to gloo cmake (#30857)
      
      * Destroy session first. (#30954)
      
      Destroy session first.
      
      * merge
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style, test=develop
      
      * fix, test=develop
      
      * fix
      
      * fix log fatal, test=develop
      
      * fix enforce style, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix rccl, test=develop
      
      * fix test, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix node_num, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      Co-authored-by: hutuxian <hutuxian2011@sina.cn>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: Void Main <voidmain1313113@gmail.com>
      Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
      Co-authored-by: dingsiyu <18369187719@163.com>
      Co-authored-by: OleNet <olenet@126.com>
  14. 06 April, 2021 1 commit
  15. 01 April, 2021 2 commits
  16. 15 March, 2021 1 commit
  17. 24 February, 2021 1 commit
  18. 20 February, 2021 1 commit
  19. 01 February, 2021 1 commit
  20. 21 January, 2021 1 commit
  21. 20 January, 2021 1 commit
  22. 12 January, 2021 2 commits
  23. 31 December, 2020 1 commit
  24. 24 December, 2020 1 commit
  25. 22 December, 2020 1 commit
  26. 09 December, 2020 1 commit
  27. 08 December, 2020 1 commit
  28. 04 December, 2020 1 commit
  29. 03 December, 2020 2 commits
  30. 01 December, 2020 2 commits
  31. 27 November, 2020 2 commits
  32. 26 November, 2020 1 commit
    • [sharding] doc, api, bug fixed (#28983) · 0dadacc4
      Committed by JZ-LIANG
      * add lars to fleet meta optimizer
      
      * add lamb to proto
      
      * add lamb to fleet meta optimizer
      
      * fixed syntax bug
      
      * fixed syntax bug
      
      * fixed syntax error in lamb, add config setter of lamb in distributed_strategy
      
      * trigger unittest to rerun

      * add new unittest func for lamb

      * revise unittest for lars and lamb

      * revise dgc meta unittest
      
      * revise lars document in distribute_strategy
      
      * revise lars lamb document in distributed_strategy.py
      
      * revise lars lamb document in distributed_strategy.py
      
      * add weight decay exclude logic to lars
      
      * restore optimizer.py
      
      * restore optimizer.py as develop except lars
      
      * add epsilon and exclude fn to distributed_strategy
      
      * add lars epsilon
      
      * revise unittest for fleet lars and lamb

      * revise lars lamb unittest for CI coverage
      
      * revise lars argument api
      
      * revise lars argument api
      
      * revise lars argument api
      
      * revise api doc of lars
      
      * fix op role
      
      * add sharding save and add_sync_comm_for_test function
      
      * add comm_analyse to utils
      
      * revise sharding_utils
      
      * add sharding saving unittest
      
      * revise sharding utils for unittest
      
      * revise sharding en doc
      
      * update sharding utils api
      
      * add doc for sharding
      
      * fixed bug in sharding var size count
      
      * update varsize count in sharding
      
      * fix sharding num_nccl_comm
      
      * Revert "fix sharding num_nccl_comm"
      
      This reverts commit d51587c15e9323acf226ddd36154275f0d1daf76.