1. 25 Feb, 2021 2 commits
    • Mig stateless call (#4215) · 8c0dca90
      qq_22305325 committed
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * mig OpKernelObject
      
      * mig object_storage
      
      * make of_format
      
      * del comment
      
      * std::function<void(Object*)>
      
      * mig NewOpKernelObject and _StatefulCallOpKernel
      
      * mig _StatefulCallOpKernel and GetSharedOpKernelObject4ParallelConfSymbol
      
      * del object_storage.cpp
      
      * use name GLOBAL_PARA_SYM2SHARED_OPKENEL_OBJ_MUTEX
      
      * mig CheckRefInBlobObjectParallelDesc and OperandBlobObjects related api
      
      * mig _StatelessCall
      
      * mig _StatelessCall
      
      * del comment
      
      * use cfg_op_conf and Object*
      
      * use Object*
      
      * del _
      
      * fix func name error
      
      * use MapAt and shared_ptr
      
      * use shared_ptr or const ref
      
      * minor fix
      
      * add todo
      
      * minor fix
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Reduce unittest cost (#4253) · 2981ca73
      cheng cheng committed
      * reduce unittest cost
      
      * format
      
      * refactor unittest for reduce cost
      
      * refine code for review
      
      * moments test double
      
      * rollback top k shape dim
      
      * fix axis err in test_prelu
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  2. 24 Feb, 2021 7 commits
  3. 23 Feb, 2021 8 commits
  4. 22 Feb, 2021 4 commits
  5. 21 Feb, 2021 1 commit
  6. 20 Feb, 2021 5 commits
  7. 19 Feb, 2021 3 commits
  8. 18 Feb, 2021 2 commits
    • NCCL use compute stream to memory cost & speed up (#4221) · 45697b0c
      cheng cheng committed
      * Enable insert nccl logical op pass
      
      * FindMaxConnectedSubgraphForGpuExecOrder~
      
      * through order and interface
      
      * implement of insert nccl logical op in pass
      
      * add nccl logical op using UserOp Implement and EagerNcclCommMgr
      
      * add NCCL ReduceScatter op/kernel; refine pass impl of topo order
      
      * add NCCL logical op/kernel AllGather
      
      * fix bug of reduce scatter/ all gather infer shape
      
      * refine log and note
      
      * fix compiler err build with CPU ONLY
      
      * support NCCL ALL2ALL and test pass of alexnet model parallel
      
      * rollback of diff in checkpointing_pass.cpp
      
      * rename to nccl_use_compute_stream; ResourceDesc::nccl_use_compute_stream; refine name for review; create nccl_comm_ in KernelCompute;
      
      * refine code for review
      
      * add unittest for nccl use compute stream
      
      * format test scripts
      
      * refine align
    • Refactor InferBatchAxis (#4219) · 5d259566
      Juncheng committed
      * Refactor InferBatchAxis
      
      * refine
  9. 17 Feb, 2021 1 commit
    • Multi reentrant lock (#4225) · 3d33bde2
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * vectorizedly link main job
      
      * resize vector identity_tick_op_names then access elements
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      
      * fix a bug in CaseCompTaskNode; fix a bug when create identity tick in main_job
      
      * 1) Insert tick between source tick and src_subset_tick; 2) Insert tick between dst_subset_tick and sink tick
      
      * stash code
      
      * refactor MakeMainJob by using Range::ForEachSubRange
      
      * refactor MakeMainJob by using Range::ForEachSubRange
      
      * rename ReentrantLockLinkPoint to ReentrantLockBackEdge
      
      * set piece id for regst sent by wait_and_send_ids actor
      
      * callback_notifier_sink_tick
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  10. 15 Feb, 2021 1 commit
    • Vectorized linking main job (#4223) · b4fcfd50
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * vectorizedly link main job
      
      * resize vector identity_tick_op_names then access elements
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      
      * fix a bug in CaseCompTaskNode; fix a bug when create identity tick in main_job
      
      * 1) Insert tick between source tick and src_subset_tick; 2) Insert tick between dst_subset_tick and sink tick
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  11. 14 Feb, 2021 1 commit
    • Sink tick in main job (#4207) · 25d9c26c
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  12. 09 Feb, 2021 5 commits
    • Tick per machine (#4204) · c95516da
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * fix typo in task_graph
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Mig opkernel obj (#4212) · b711cf0c
      qq_22305325 committed
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * mig OpKernelObject
      
      * mig object_storage
      
      * make of_format
      
      * del comment
      
      * del comment
      
      * use cfg_op_conf and Object*
      
      * use Object*
    • f9268788
    • Mig op conf sym (#4213) · aea03748
      qq_22305325 committed
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * del comment
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Add CTC Loss (#4034) · d97b0218
      Zhenhua committed
      * Add CTC Loss
      
      * Add backward kernel
      
      * Remove tf in test
      
      * Update api document
      
      * Add zero_infinity option
      
      * refine
      
      * Add 1n2d test case
      
      * Switch to consistent_view
      
      * Fix Eager mode
      
      * Remove duplicate license
      
      * Add grad check
      
      * Fix bw test
      
      * Fix bugs
      
      * Add op name
      
      * Refine
      
      * of_format
      
      * Expand annotation
      
      * Performance optimizing for cuda
      
      * Check input_length & target_lengths
      
      * Update __syncthreads
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>