1. Aug 17, 2021: 11 commits
  2. Aug 16, 2021: 11 commits
  3. Aug 15, 2021: 10 commits
  4. Aug 14, 2021: 8 commits
    • add flow.rand (#5722) · 1db57451
      Committed by Bowen Chen
      * add flow.rand
      
      * update docstr
      
      * update docstr
      
      * add consistent_rand, add more tests
      
      * update random op
      
      * refine
      
      * refine, add range and int type to uniform_kernel
      
      * refine
      
      * refine
      
      * update doc
      
      * update doc
      
      * Refactor UniformDistribution
      
      * fix
      Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
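The commits above add `flow.rand` and a uniform kernel. As a plain-Python sketch of the semantics such an op provides (a tensor of the given shape filled with uniform samples in [0, 1)); this is a hypothetical stand-in, not the OneFlow implementation:

```python
import random

def rand_like_flow_rand(*shape):
    """Nested lists of uniform [0, 1) samples with the given shape.

    Stand-in for the semantics of a rand op; real frameworks return a tensor.
    """
    if not shape:
        return random.random()
    return [rand_like_flow_rand(*shape[1:]) for _ in range(shape[0])]

sample = rand_like_flow_rand(2, 3)
assert len(sample) == 2 and len(sample[0]) == 3
assert all(0.0 <= x < 1.0 for row in sample for x in row)
```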
    • Bugfix async callback (#5881) · 3001d335
      Committed by Li Xinqi
      * SyncAccessBlobByCallback
      
      * refactor capture-by-reference to capture-by-value
      
      * refactor InstructionsBuilder::SyncAccessBlobByCallback
      Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
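The "refactor capture-by-reference to capture-by-value" commit is a classic async-callback fix: a callback capturing by reference can observe a value that changed (or died) before it runs. The PR changes C++ lambdas; the same hazard can be sketched with Python closures, where binding via a default argument plays the role of capture-by-value:

```python
# Capture "by reference": every lambda closes over the same loop variable,
# so all of them see its final value by the time they are invoked.
callbacks_by_ref = [lambda: i for i in range(3)]

# Capture "by value": the default argument binds i's value at creation time.
callbacks_by_val = [lambda i=i: i for i in range(3)]

assert [f() for f in callbacks_by_ref] == [2, 2, 2]
assert [f() for f in callbacks_by_val] == [0, 1, 2]
```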
    • enable CMake first class cuda support (#5858) · d170a54a
      Committed by Shenghang Tsai
      * cmake first class cuda support
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * rm useless
      
      * refine
      
      * refine
      
      * also link cuda libs if build static
      
      * refine
      
      * refine
      
      * add
      
      * Revert "add"
      
      This reverts commit d9e67ad1.
      
      * fix
      
      * refine
      
      * refine
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
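"First-class CUDA support" refers to CMake treating CUDA as a project language rather than going through the legacy `find_package(CUDA)` module. A minimal sketch of what that looks like (file names here are placeholders, not the actual OneFlow build files):

```cmake
# CUDA as a first-class language: nvcc is driven by CMake itself.
cmake_minimum_required(VERSION 3.18)
project(demo LANGUAGES CXX CUDA)

# Modern imported targets for the CUDA runtime libraries.
find_package(CUDAToolkit REQUIRED)

add_library(kernels STATIC kernel.cu)   # kernel.cu is a placeholder source
set_target_properties(kernels PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
target_link_libraries(kernels PRIVATE CUDA::cudart)
```

This also explains the "also link cuda libs if build static" commit: with static builds, the CUDA runtime libraries must be linked explicitly on the final targets.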
    • Feat tensor to bool (#5836) · 45ec2370
      Committed by Yinggang Wang
      * feat(Tensor): support Tensor.__bool__()
      
      * test(Tensor): add tensor to bool test
      
      * docs(Tensor): refine is_nonzero document
      
      * format
      
      * fix(Tensor): fix Tensor.__bool__ bug
      
      * auto format by CI
      
      * fix(instancenorm): fix merge bug
      
      * fix(*): fix merge bugs
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
      Co-authored-by: cheng cheng <472491134@qq.com>
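The PR above adds `Tensor.__bool__()` and refines the `is_nonzero` docs. The usual framework convention (which the commits appear to follow, given the `is_nonzero` reference) is that only a one-element tensor converts to bool. A hypothetical stand-in class sketching that contract, not the OneFlow implementation:

```python
class MiniTensor:
    """Toy 1-D tensor illustrating the single-element __bool__ convention."""

    def __init__(self, data):
        self.data = list(data)

    def __bool__(self):
        # Truth value of a multi-element tensor is ambiguous by convention.
        if len(self.data) != 1:
            raise RuntimeError(
                "bool value of Tensor with more than one element is ambiguous")
        return bool(self.data[0])

assert bool(MiniTensor([1])) is True
assert bool(MiniTensor([0])) is False
try:
    bool(MiniTensor([1, 2]))
except RuntimeError:
    pass  # expected: multi-element tensors reject bool conversion
```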
    • Tensor str (#5783) · 713d30e8
      Committed by liufengwei0103
      * refine code
      
      * refine code
      
      * optimize code
      
      * refine code
      
      * refine
      
      * back up
      
      * add tensor.to func
      
      * make of_format
      
      * remove to in pyTensor
      
      * sync gpu data
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * backup
      
      * refine
      
      * rebase
      
      * check in gen py
      
      * merge master and fix bugs
      
      * address pr comments
      
      * eager boxing
      
      * address pr comments
      
      * fix b2p error
      
      * auto format by CI
      
      * remove boxing
      
      * export sbp
      
      * add tensor to_consistent
      
      * minor fix
      
      * minor fix
      
      * refine
      
      * remove useless head file
      
      * Fix optional
      
      * remove to in tensor.cpp
      
      * update
      
      * Support symbol placement type in functional.
      
      * add sbp and sbp list arg
      
      * refine
      
      * use functional
      
      * refactor CastConsistentOpExpr
      
      * to_consistent(flow.B) backward
      
      * Cache op expr
      
      * add EagerNcclOpKernelState
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * minor fix
      
      * capture OpInterpContext
      
      * unimplemented apply
      
      * add GetNdSbp
      
      * add mutex
      
      * refine
      
      * merge EagerConsistentTensorImpl::NewWithPhyTensor and EagerConsistentTensorImpl::NewWithoutPhyTensor into EagerConsistentTensorImpl::New
      
      * rename function SyncData to SyncMetaAndData
      
      * fix function yml
      
      * refine
      
      * refine
      
      * refine collective boxing
      
      * make of_format
      
      * of_format
      
      * add to_local to pybind
      
      * refactor EagerBoxingInterpreter
      
      * minor fix
      
      * optimize CastParallelDistribution
      
      * add placement_sbp_util
      
      * minor fix
      
      * eager boxing backward
      
      * minor fix
      
      * sync shape and data when tensor_to_local
      
      * fix rpc_token bugs
      
      * fix p2s backward bug
      
      * refactor AsyncRpcCtx
      
      * set logical_shape correctly
      
      * simplify implementation of consistent_tensor.to_local
      
      * refine
      
      * initialize rpc_token with zero
      
      * refactor grad functions of to_consistent/to_local
      
      * refine
      
      * reformat and address pr comment
      
      * reformat
      
      * add check_meta_consistency in consistent2consistent
      
      * refactor eager_nccl_reduce kernel
      
      * refine
      
      * refine to_consistent api
      
      * ban_non_pod_data_in_eager_boxing
      
      * refine
      
      * refine
      
      * refine
      
      * backup code
      
      * THREAD_LOCAL_CACHED
      
      * Delete thread_local_cache.h
      
      * bugfix: DeviceId4ParallelId -> MachineId4ParallelId
      
      * optimize
      
      * support tensor str
      
      * Init code and can print consistent
      
      * refine format
      
      * remove useless to_consistent and format
      
      * refine code and print according data
      
      * attempt to support multi rank when fetch data
      
      * Revert "attempt to support multi rank when fetch data"
      
      This reverts commit ae56afad.
      
      * skip if tensor is consistent
      
      * delete useless
      
      * add comment
      
      * delete useless
      
      * traversal data to determine if int_mode
      
      * if consistent, return [...]
      
      * refine
      
      * add test and fix bug
      
      * add more assertTrue and delete useless
      
      * getitem using integer return scalar when tensor shape is [1]
      
      * add test case
      
      * refine
      
      * fix spelling mistake
      
      * add op test and enhance in parse device
      
      * fix bug
      
      * fix docstr test bug and support to print meta
      
      * refine
      
      * auto format by CI
      
      * fix docstr in clip_grad.py
      
      * fix docstr
      
      * fix docstr and bug
      
      * the input shape parameter of reshape changed
      
      * add with flow.no_grad when operate tensor
      
      * fix docstr
      Co-authored-by: clackhan <han_binbin@163.com>
      Co-authored-by: tsai <jackalcooper@gmail.com>
      Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
      Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: wyg1997 <wyg19970408@gmail.com>
      Co-authored-by: cheng cheng <472491134@qq.com>
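One concrete step in the tensor-printing work above is "traversal data to determine if int_mode": walk the values to decide whether they are all integral, which controls whether the tensor prints without decimal points. A plain-Python sketch of that idea (hypothetical helpers, not OneFlow's actual formatter):

```python
def int_mode(values):
    """True when every value is integral, so printing can drop decimals."""
    return all(float(v).is_integer() for v in values)

def format_values(values):
    if int_mode(values):
        return "[" + ", ".join(str(int(v)) for v in values) + "]"
    # Fixed-precision float formatting when any value is fractional.
    return "[" + ", ".join(f"{float(v):.4f}" for v in values) + "]"

assert format_values([1.0, 2.0, 3.0]) == "[1, 2, 3]"
assert format_values([1.0, 2.5]) == "[1.0000, 2.5000]"
```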
    • Lazy to_consistent (#5774) · f4a7f739
      Committed by leaves-zwx
      * refine code
      
      * optimize code
      
      * refine code
      
      * refine
      
      * back up
      
      * add tensor.to func
      
      * make of_format
      
      * remove to in pyTensor
      
      * sync gpu data
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * backup
      
      * refine
      
      * rebase
      
      * check in gen py
      
      * merge master and fix bugs
      
      * address pr comments
      
      * eager boxing
      
      * address pr comments
      
      * fix b2p error
      
      * auto format by CI
      
      * remove boxing
      
      * export sbp
      
      * add tensor to_consistent
      
      * minor fix
      
      * minor fix
      
      * refine
      
      * remove useless head file
      
      * Fix optional
      
      * remove to in tensor.cpp
      
      * update
      
      * Support symbol placement type in functional.
      
      * add sbp and sbp list arg
      
      * refine
      
      * use functional
      
      * refactor CastConsistentOpExpr
      
      * to_consistent(flow.B) backward
      
      * Cache op expr
      
      * add EagerNcclOpKernelState
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * minor fix
      
      * capture OpInterpContext
      
      * unimplemented apply
      
      * add GetNdSbp
      
      * add mutex
      
      * refine
      
      * merge EagerConsistentTensorImpl::NewWithPhyTensor and EagerConsistentTensorImpl::NewWithoutPhyTensor into EagerConsistentTensorImpl::New
      
      * rename function SyncData to SyncMetaAndData
      
      * fix function yml
      
      * refine
      
      * refine
      
      * refine collective boxing
      
      * make of_format
      
      * of_format
      
      * add to_local to pybind
      
      * refactor EagerBoxingInterpreter
      
      * minor fix
      
      * optimize CastParallelDistribution
      
      * add placement_sbp_util
      
      * minor fix
      
      * eager boxing backward
      
      * minor fix
      
      * sync shape and data when tensor_to_local
      
      * fix rpc_token bugs
      
      * fix p2s backward bug
      
      * refactor AsyncRpcCtx
      
      * set logical_shape correctly
      
      * simplify implementation of consistent_tensor.to_local
      
      * refine
      
      * initialize rpc_token with zero
      
      * refactor grad functions of to_consistent/to_local
      
      * refine
      
      * reformat and address pr comment
      
      * reformat
      
      * add check_meta_consistency in consistent2consistent
      
      * refactor eager_nccl_reduce kernel
      
      * refine
      
      * refine to_consistent api
      
      * ban_non_pod_data_in_eager_boxing
      
      * refine
      
      * refine
      
      * refine
      
      * backup code
      
      * THREAD_LOCAL_CACHED
      
      * Delete thread_local_cache.h
      
      * bugfix: DeviceId4ParallelId -> MachineId4ParallelId
      
      * optimize
      
      * minor fix
      
      * LazyInterpreterApplyImplForParallelCastOpExpr
      
      * rm eager constraint
      
      * c2c interp ctx with parallel info
      
      * multi client collective boxing
      
      * test_to_consistent
      
      * support to_consistent grad_sbp
      
      * AsConsistentTensor
      
      * pass bwd test
      
      * add multi graph test
      
      * add ConsistentToConsistentOpExpr
      
      * LazyConsistentToConsistent
      
      * interpret ConsistentToConsistentOpExpr
      
      * update test
      
      * rm useless code
      
      * auto format by CI
      
      * fix conflict
      
      * mod comment
      
      * add message for local_tensor.to_consistent() check and consistent_tensor.to_local() check in lazy
      
      * address review
      
      * fix conflict
      
      * rm check which limit placement changing
      
      * auto format by CI
      
      * fix nd_sbp
      
      * auto format by CI
      
      * refactor to.py
      
      * ConsistentToConsistentOpExpr catch free tensor
      
      * fix copy op's sbp inferring
      
      * refactor empty infer sbp
      
      * refactor constant infer sbp
      
      * mod coco reader sbp inferring
      
      * fix GetSbpFn
      
      * fix consistent_to
      
      * fix (#5857)
      Co-authored-by: leaves-zwx <kunta0932@gmail.com>
      
      * modify comments
      
      * add test_to_placement case
      
      * clear code
      
      * unready test
      
      * refactor with InferNdSbp4SrcOp
      
      * rm out-dated comment
      
      * tidy code
      
      * SBP str -> cfg::SbpParallel
      Co-authored-by: clackhan <han_binbin@163.com>
      Co-authored-by: tsai <jackalcooper@gmail.com>
      Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
      Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: Liang Depeng <liangdepeng@gmail.com>
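The final commit in this PR, "SBP str -> cfg::SbpParallel", converts the string spellings of SBP signatures ("B" for broadcast, "P" for partial-sum, "S(axis)" for split) into structured values. A hypothetical Python sketch of that parse, not the actual `cfg::SbpParallel` binding:

```python
def parse_sbp(s):
    """Parse an SBP string ("B", "P", or "S(axis)") into a tagged tuple."""
    if s == "B":
        return ("broadcast",)
    if s == "P":
        return ("partial_sum",)
    if s.startswith("S(") and s.endswith(")"):
        return ("split", int(s[2:-1]))
    raise ValueError(f"unrecognized SBP string: {s!r}")

assert parse_sbp("B") == ("broadcast",)
assert parse_sbp("S(0)") == ("split", 0)
```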
    • Broadcast consistent shape and dtype (#5784) · 660a4c48
      Committed by Li Xinqi
      * GetBroadcastGroup
      
      * fix comment typo.
      
      * broadcast shape and dtype
      
      * 1) rm THREAD_LOCAL_CACHED; 2) fix bugs in ThreadLocal
      
      * fix wrong use of LocalRank
      
      * revert several code from master
      
      * fix compiler complain
      
      * merge master
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
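The "broadcast shape and dtype" commit makes every rank agree on a consistent tensor's metadata by broadcasting it from one rank. A sketch of the idea with the collective faked by a plain function (real code would use an actual broadcast such as NCCL's):

```python
def broadcast_meta(local_metas, root=0):
    """Return per-rank (shape, dtype) metadata after broadcasting root's copy.

    local_metas is indexed by rank; the fake "collective" just replaces every
    rank's metadata with the root rank's.
    """
    root_meta = local_metas[root]
    return [root_meta for _ in local_metas]

# Rank 2 disagrees before the broadcast; afterwards all ranks match rank 0.
metas = [((2, 3), "float32"), ((2, 3), "float32"), ((4, 1), "int64")]
synced = broadcast_meta(metas)
assert synced == [((2, 3), "float32")] * 3
```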