1. 26 Aug, 2022 1 commit
    • R
      move collective tests into a collective directory (#45223) · 9eb4d89b
      Committed by Roc
      * add simple reformatted ci files
      
      * update
      
      * add readme for new unittests
      
      * add readme for new unittests
      
      * add readme for new unittests
      
      * reset mlu
      
      * update for samples
      
      * add base api
      
      * reset some dist unit tests
      
      * add warning in generated cmakelists file
      
      * update readme for new dist unit tests
      
      * add all collective tests
      
      * remain base file and launcher file
      
      * Update README.md
      
      * Update README.md
      
      * fix env PYTHONPATH
      
      * Update gen_ut_cmakelists.py
      
      * add all collective tests
      
      * add docs for gen_ut_cmakelists.py
      
      * prettify code
      
      * comment name == "name"
      
      * update for comments
      
      * update function's help
      
      * update for run type
      
      * update readme
      
      * add all collective tests
      
      * add all collective tests
      
      * mv  collective test files
      
      * update for all collective tests
      
      * update
      
      * update
      
      * update
      
      * update for all tests
      
      * update for checking name
      
      * Update Cmakelists.txt
      
      * update testlist.csv
      
      * remain test_parallel_dygraph_dataparallel in unittests
      
      * set broadcast op all platforms
      
      * update
      
      * remain test_broadcast_tensors_op
      
      * fix
      
      * rm some collective files
      
      * update more collective tests
      
      * update
      
      * update
      
      * update
      gen_ut_supports recursion
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix nccl version
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix a bug and try to pass
      
      * update
      
      * add csv
      
      * update for timeout
      
      * remove tcp store
      
      * fix
      
      * fix
      
      * update
      
      * update
      
      * update for more dist tests
      
      * move multi node tests
      
      * update
      
      * update
      
      * update
      
      * fix for auto parallel
      
      * update
      
      * update path in python file
      
      * update
      
      * reset some test in unittests
      
      * fix
      
      * update readme
      
      * fix
      
      * update
      
      * fix port
  2. 05 Jun, 2022 1 commit
    • S
      【code format check upgrade】 step2:yapf (#42944) · a072fca8
      Committed by Sing_chan
      * use yapf to format all python files
      
      * exclude two unittest files from yapf because they rely on writing and reading files, and formatting would break them
      
      * disable diff_py_file because too many diff files cause the following command to fail
  3. 26 Apr, 2021 1 commit
  4. 22 Sep, 2020 1 commit
    • P
      Use dygraph mode by default (#27443) · 827ac36f
      Committed by pangyoki
      * default open dygraph mode
      
      * fix CI-Mac
      
      * fix Mac-CI other unittest file
      
      * fix CI-Py3
      
      * fix test_communicator_geo and test_buffer_shared_memory_reuse_pass
      
      * add enable_static to fix CI-Py3
      
      * add enable_static to fix CI-coverage
      
      * delete try except
  5. 27 Aug, 2020 1 commit
  6. 03 Dec, 2019 1 commit
  7. 27 Jun, 2019 1 commit
    • H
      supports collective communicated training (#18175) · b7128bac
      Committed by HaoRen
      * fix prepare context redundant code problem, optimize executor by caching create_varaiables
      test=develop
      
      * supports collective training in executor
      
      * make fetch_list runnable with variables, add more unittests for use_program_cache
      test=develop
      
      * fix comment
      test=develop
      
      * use unique name for nccl_id
      
      * supports output to stream in program_to_code
      
      * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
      
      * set op role in collective training
      
      * add collective op role
      
      * remove orig file
      
      * add build optimizer by strategy
      
      * add collective strategy
      
      * refine collective strategy
      
      * add multi-process role maker
      
      * refine strategy building factory so that we can easily plug in more strategies
      
      * scale loss grad in collective sgd transpiler
      
      * add support for distributed fc
      
      * code format
      
      * revert some features for dist fc
      
      * add support for distributed fc training
      
      * fix prepare context redundant code problem, optimize executor by caching create_varaiables
      test=develop
      
      * supports collective training in executor
      
      * make fetch_list runnable with variables, add more unittests for use_program_cache
      test=develop
      
      * use unique name for nccl_id
      
      * supports output to stream in program_to_code
      
      * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
      
      * set op role in collective training
      
      * add collective op role
      
      * fix comment
      test=develop
      
      * remove orig file
      
      * add build optimizer by strategy
      
      * add collective strategy
      
      * refine collective strategy
      
      * add multi-process role maker
      
      * refine strategy building factory so that we can easily plug in more strategies
      
      * scale loss grad in collective sgd transpiler
      
      * add support for distributed fc
      
      * code format
      
      * revert some features for dist fc
      
      * add support for distributed fc training
      
      * test=develop
      add collective op unittest standard
      
      * test=develop
      remove the test_collective directory
      
      * test=develop
      remove the test_collective directory
      
      * remove slicegather test
      
      * code format for reducescatter
      
      * update attr of shard_index_op
      
      * Modify macro nccl_helper
      
      * remove test without distribute
      
      * macro collective_helper
      
      * macro update
      
      * test=develop
      update support python3.5
      
      * test=develop change gpu memory use to 0.1 when test
      
      * test=develop
      update ut equal func
      
      * test=develop
      set flags to 1.5
      
      * test=develop fix pickle dump py35
      
      * test=develop
      fix divide in slice and add sync_comm_stream
      update atol and rtol to 1e-05
      rm shard_index op and test
      modify read input from file to read from memory
      remove origin_program in framework and add i/o in c_sync_calc_stream
      
      * test=develop update unittest sync operator I/O