1. 17 Nov, 2020 3 commits
  2. 16 Nov, 2020 8 commits
  3. 15 Nov, 2020 2 commits
  4. 13 Nov, 2020 3 commits
  5. 12 Nov, 2020 3 commits
  6. 11 Nov, 2020 2 commits
  7. 09 Nov, 2020 1 commit
  8. 08 Nov, 2020 1 commit
  9. 07 Nov, 2020 2 commits
    • use format (#3778) · 5a6dec3c
      Committed by Shenghang Tsai
      Co-authored-by: NTsai <caishenghang@oneflow.org>
    • Multi node support in CI (#3735) · 0fa62fac
      Committed by Shenghang Tsai
      * install RDMA driver
      
      * install gdb openssh-server openssh-client
      
      * add requests to test docker
      
      * rm
      
      * add remote launch
      
      * add ssh setting
      
      * add user name in tag
      
      * add --privileged --network=host
      
      * mv shell cmds to python
      
      * launch client
      
      * add more arg
      
      * refine launch
      
      * refactor oneflow worker support non 22 port
      
      * refactor tmp dotssh dir
      
      * run script working
      
      * refine test code
      
      * support more args and remote launch timeout 15=>10
      
      * refine oneflow worker copy
      
      * add log in comm net
      
      * refactor distributed run work dir
      
      * refactor distribute run
      
      * refactor oneflow worker
      
      * refactor unitest
      
      * fix socket port reuse
      
      * refine code
      
      * fix CHECK_EQ
      
      * skip TestDynamicBinary fornow
      
      * check in shell script
      
      * add more log
      
      * refactor oneflow worker
      
      * rm skip
      
      * add abort msg
      
      * ignore .cache
      
      * refactor cluster
      
      * abort working
      
      * cherry picking changes from multi node ci
      
      * call original exit in monkey patch
      
      * call original exit in monkey patch
      
      * add bazel_cache for XLA build
      
      * Revert "add bazel_cache for XLA build"
      
      This reverts commit 5903717d491fc5be8b1fd19cddadd826863c0d76.
      
      * global
      
      * dont del
      
      * check in script change
      
      * quick fix
      
      * support auto docker img build
      
      * revert changes for debug
      
      * support node get_affiliations
      
      * some hardcoded resolve
      
      * rename func
      
      * add git ignore
      
      * check img
      
      * add step
      
      * change to aliyun pypi mirror
      
      * mv pip install to the end
      
      * use https
      
      * mv distributed up
      
      * add make_dotssh flag
      
      * rm debug line
      
      * add a space in log
      Co-authored-by: Ntsai <caishenghang@oneflow.org>
  10. 06 Nov, 2020 4 commits
  11. 05 Nov, 2020 3 commits
  12. 03 Nov, 2020 4 commits
  13. 02 Nov, 2020 3 commits
  14. 31 Oct, 2020 1 commit