1. 20 Feb, 2021 4 commits
  2. 19 Feb, 2021 3 commits
  3. 18 Feb, 2021 2 commits
    • NCCL use compute stream to memory cost & speed up (#4221) · 45697b0c
      cheng cheng committed
      * Enable insert nccl logical op pass
      
      * FindMaxConnectedSubgraphForGpuExecOrder~
      
      * through order and interface
      
      * implement of insert nccl logical op in pass
      
      * add nccl logical op using UserOp Implement and EagerNcclCommMgr
      
      * add NCCL ReduceScatter op/kernel; refine pass impl of topo order
      
      * add NCCL logical op/kernel AllGather
      
      * fix bug of reduce scatter/ all gather infer shape
      
      * refine log and note
      
      * fix compiler err build with CPU ONLY
      
      * support NCCL ALL2ALL and test pass of alexnet model parallel
      
      * rollback of diff in checkpointing_pass.cpp
      
      * rename to nccl_use_compute_stream; ResourceDesc::nccl_use_compute_stream; refine name for review; create nccl_comm_ in KernelCompute;
      
      * refine code for review
      
      * add unittest for nccl use compute stream
      
      * format test scripts
      
      * refine align
    • Refactor InferBatchAxis (#4219) · 5d259566
      Juncheng committed
      * Refactor InferBatchAxis
      
      * refine
  4. 17 Feb, 2021 1 commit
    • Multi reentrant lock (#4225) · 3d33bde2
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * vectorizedly link main job
      
      * resize vector identity_tick_op_names then access elements
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      
      * fix a bug in CaseCompTaskNode; fix a bug when create identity tick in main_job
      
      * 1) Insert tick between source tick and src_subset_tick; 2) Insert tick between dst_subset_tick and sink tick
      
      * stash code
      
      * refactor MakeMainJob by using Range::ForEachSubRange
      
      * refactor MakeMainJob by using Range::ForEachSubRange
      
      * rename ReentrantLockLinkPoint to ReentrantLockBackEdge
      
      * set piece id for regst sent by wait_and_send_ids actor
      
      * callback_notifier_sink_tick
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  5. 15 Feb, 2021 1 commit
    • Vectorized linking main job (#4223) · b4fcfd50
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * vectorizedly link main job
      
      * resize vector identity_tick_op_names then access elements
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      
      * fix a bug in CaseCompTaskNode; fix a bug when create identity tick in main_job
      
      * 1) Insert tick between source tick and src_subset_tick; 2) Insert tick between dst_subset_tick and sink tick
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  6. 14 Feb, 2021 1 commit
    • Sink tick in main job (#4207) · 25d9c26c
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * add sink tick op in main_job
      
      * refactor LinkMainJob
      
      * fix typo in task_graph
      
      * refactor AddGlobalCriticalSection
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * add src_subset_tick for input-output critical section
      
      * refactor AutoSourceTick and AutoSinkTick
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  7. 09 Feb, 2021 5 commits
    • Tick per machine (#4204) · c95516da
      Li Xinqi committed
      * source subset tick
      
      * remove useless header files
      
      * insert DstSubsetTickOp
      
      * remove incorrect CHECK
      
      * add tick op for each machine
      
      * TryBindBnWithOneofRegst
      
      * fix typo in task_graph
      
      * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs
      
      * SrcSubsetTickCompTaskNode: bind bns and in_regst if bns is valid in current device
      
      * refactor optional input to repeated inputs for SrcSubsetTickOpConf
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Mig opkernel obj (#4212) · b711cf0c
      qq_22305325 committed
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * mig OpKernelObject
      
      * mig object_storage
      
      * make of_format
      
      * del comment
      
      * del comment
      
      * use cfg_op_conf and Object*
      
      * use Object*
    • O
      f9268788
    • Mig op conf sym (#4213) · aea03748
      qq_22305325 committed
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * del comment
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Add CTC Loss (#4034) · d97b0218
      Zhenhua committed
      * Add CTC Loss
      
      * Add backward kernel
      
      * Remove tf in test
      
      * Update api document
      
      * Add zero_infinity option
      
      * refine
      
      * Add 1n2d test case
      
      * Switch to consistent_view
      
      * Fix Eager mode
      
      * Remove duplicate license
      
      * Add grad check
      
      * Fix bw test
      
      * Fix bugs
      
      * Add op name
      
      * Refine
      
      * of_format
      
      * Expand annotation
      
      * Performance optimizing for cuda
      
      * Check input_length & target_lengths
      
      * Update __syncthreads
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  8. 08 Feb, 2021 5 commits
  9. 07 Feb, 2021 2 commits
  10. 05 Feb, 2021 3 commits
    • Add Operator::InferInplaceObn2IbnIf (#4191) · 238e8bfe
      Juncheng committed
      * Add Operator::InferInplaceObn2IbnIf
      
      * remove useless header
      
      * make InferInplaceObn2Ibn protected
    • Add ReplicationPad2D op (#4190) · f7a95d42
      iamyf committed
      * add pad2d ops and kernels
      
      * fix bug
      
      * add python api and unittest
      
      * reformat, change padding_data_type seq, change dim2vector method usage
      
      * fix typo
      
      * delete ShapeViewToDimVector
      
      * rerun make of_format
    • Implementation of SavedModel and InferenceSession (#4066) · 56cd6a38
      leaves-zwx committed
      * save model and load model demo
      
      * fix
      
      * tensor.proto and copy signature
      
      * pass test
      
      * add load_saved_model function for InferenceSession
      
      * wait_for_all_jobs_finished
      
      * test_alexnet_save_and_load
      
      * support change batch_size
      
      * support batch axis
      
      * add ci test
      
      * revert job_build_and_infer_ctx api
      
      * simplify test script
      
      * following update
      
      * improve search function of InferenceSession
      
      * fix break update
      
      * add cv2 to dev-requirements
      
      * Update Dockerfile
      
      * rm 3.5
      
      * fix
      
      * quick workaround
      
      * speed up bazel
      
      * port changes
      
      * revert workaround
      
      * rm batch_axis in JobInputDef and JobOutputDef
      
      * rm export for ImageNetRecordDataset
      
      * refine Complete api for GraphBuilder and SignatureBuilder
      
      * refine check op is mirrored
      
      * fix by review comment ci test
      
      * InferenceSession is not responsible to destroy env
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: Tsai <caishenghang@oneflow.org>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  11. 03 Feb, 2021 6 commits
  12. 02 Feb, 2021 4 commits
  13. 01 Feb, 2021 2 commits
  14. 31 Jan, 2021 1 commit
    • Refine manylinux dockerfile (#4141) · 062d802a
      Shenghang Tsai committed
      * manylinux docker use pip_args
      
      * optional bazel url
      
      * move args
      
      * fix repo url
      
      * reorder cmd
      
      * fix github case
      
      * update manylinux sha
      
      * http proxy lower case
      
      * rm err msg
      
      * mv msg
      
      * fix case
      
      * add exit 1
      
      * disable centos-sclo-rh
      
      * centos-sclo-rh skip_if_unavailable
      
      * MANYLINUX_SHA
      
      * Update Dockerfile
      
      * Update Dockerfile
      
      * refine
      
      * use ali
      
      * port more changes
      
      * use oneflow url
      
      * it works
      
      * add rsync
      
      * reorder
      
      * refine
      
      * refine
      
      * refine
      
      * use mirror install cpython
      
      * larger tol
      Co-authored-by: Tsai <caishenghang@oneflow.org>