1. 01 Nov, 2021 1 commit
    • Change maybe to optional (#6611) · 380d2414
      Committed by Zhanghuihong Guan
      * initial commit, add code for async construct tensor from numpy array
      
      * initial commit to change Maybe to Optional
      
      * delete redundant code
      
      * replace Maybe with Optional
      
      * fix compile errors
      
      * format code
      
      * changes based on review
      
      * format code, fix based on review
      
      * format code
      
      * fix multiclient type
      
      * changes based on review
      
      * changes based on review
      
      * unify calling to IsMultiClient
      
      * refactor multi_client related code
      
      * restore InMultiClient interface
      
      * double check for unnecessary changes
      
      * remove unnecessary changes
      
      * format code
      
      * Update oneflow/api/python/symbol/job_conf_symbol.cpp
      
      * Update oneflow/api/python/symbol/op_conf_symbol.cpp
      
      * Update oneflow/api/python/symbol/op_node_signature_symbol.cpp
      
      * Update oneflow/core/common/optional.h
      
      * Update oneflow/api/python/symbol/string_symbol.cpp
      
      * Update oneflow/api/python/symbol/scope_symbol.cpp
      
      * Update oneflow/api/python/symbol/placement_symbol.cpp
      
      * Update oneflow/api/python/symbol/op_conf_symbol.cpp
      Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
      Co-authored-by: Twice <i@twice.moe>
  2. 30 Oct, 2021 1 commit
    • Refactor oneflow.Size (#6645) · 4be2b0a3
      Committed by Houjiang Chen
      * Refactor oneflow.Size
      
      * refine
      
      * add pybind11 caster
      
      * Support Shape cast
      
      * refine
      
      * fix size index
      
      * include size header if need export C++ Shape to Python.
  3. 18 Sep, 2021 1 commit
  4. 03 Sep, 2021 1 commit
    • Decompose nd sbp boxing (#5800) · 9c464a31
      Committed by Li Xinqi
      * GetBroadcastGroup
      
      * fix comment typo.
      
      * broadcast shape and dtype
      
      * 1) rm THREAD_LOCAL_CACHED; 2) fix bugs in ThreadLocal
      
      * fix wrong use of LocalRank
      
      * 1) a decorator for disabling recursive boxing call; 2) a decorator for checking consistent tensor meta.
      
      * don't set consistent_id when recursively calling eager consistent op interpreter.
      
      * decompose nd_sbp boxing
      
      * disable checking consistent tensor meta recursively.
      
      * GetDecomposableEquivalent
      
      * fix a unittest case bug
      
      * fix a bug in unittest
      
      * fix compiler complain
      
      * add unit tests for CalcDecomposableEquivalentShapeAndNdSbpPair
      
      * InitNdSbpValidTransformationAxisSequence
      
      * DecomposeIntoNaiveTransformations
      
      * fix compiler complains
      
      * move several unit tests in parallel_desc_test.cpp into placement_sbp_util_test.cpp
      
      * abstract_consistent_to_consistent_op_expr
      
      * fix compiler complaint
      
      * refactor consistent-to-consistent eager consistent op interpreter
      
      * fix compiler complaint
      
      * refactor ConsistentToConsistentOpExpr
      
      * lazy interpreter (#5903)
      
      * fix bugs about consistent_id
      
      * refactor functional::ToConsistent
      
      * refactor GetNdSbp
      
      * fix compiler complaints
      
      * upgrade gtest and fix static check error
      
      * update head file index
      
      * fix bug
      
      * modify path of gtest lib
      
      * refactor NaiveNdSbpBoxingInterpreter to BoxingExpr(symmetric-nd-sbp-to-nd-sbp)
      
      * fix compiler complaints
      
      * Update gmock_headers.txt
      
      * Update gtest_headers.txt
      
      * fix bug about disable checking consistent meta in local to consistent functor
      
      * fix include bug
      Co-authored-by: clackhan <han_binbin@163.com>
      Co-authored-by: leaves-zwx <kunta0932@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: liufengwei <2472937968@qq.com>
      Co-authored-by: Twice <i@twice.moe>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
  5. 28 Aug, 2021 1 commit
  6. 20 Aug, 2021 2 commits
  7. 16 Aug, 2021 1 commit
  8. 13 Aug, 2021 1 commit
  9. 06 Aug, 2021 1 commit
    • Inferface eager boxing (#5682) · 38570816
      Committed by qq_22305325
      * support_tensor_to/to_local
      
      * export consistent_tensor.to_local()
      
      * refine code
      
      * export tensor.to()...
      
      * refine code
      
      * refine code
      
      * optimize code
      
      * refine code
      
      * refine
      
      * back up
      
      * add tensor.to func
      
      * make of_format
      
      * remove to in pyTensor
      
      * sync gpu data
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * backup
      
      * refine
      
      * rebase
      
      * check in gen py
      
      * merge master and fix bugs
      
      * address pr comments
      
      * eager boxing
      
      * address pr comments
      
      * fix b2p error
      
      * auto format by CI
      
      * remove boxing
      
      * export sbp
      
      * add tensor to_consistent
      
      * /minor fix
      
      * minor fix
      
      * refine
      
      * remove useless head file
      
      * Fix optional
      
      * remove to in tensor.cpp
      
      * update
      
      * Support symbol placement type in functional.
      
      * add sbp and sbp list arg
      
      * refine
      
      * use functional
      
      * refactor CastConsistentOpExpr
      
      * to_consistent(flow.B) backward
      
      * Cache op expr
      
      * add EagerNcclOpKernelState
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * minor fix
      
      * capture OpInterpContext
      
      * unimplemented apply
      
      * add GetNdSbp
      
      * add mutex
      
      * refine
      
      * merge EagerConsistentTensorImpl::NewWithPhyTensor and EagerConsistentTensorImpl::NewWithoutPhyTensor into EagerConsistentTensorImpl::New
      
      * rename function SyncData to SyncMetaAndData
      
      * fix function yml
      
      * refine
      
      * refine
      
      * refine collective boxing
      
      * make of_format
      
      * of_format
      
      * add to_local to pybind
      
      * refactor EagerBoxingInterpreter
      
      * minor fix
      
      * optimize CastParallelDistribution
      
      * add placement_sbp_util
      
      * minor fix
      
      * eager boxing backward
      
      * minor fix
      
      * sync shape and data when tensor_to_local
      
      * fix rpc_token bugs
      
      * fix p2s backward bug
      
      * refactor AsyncRpcCtx
      
      * set logical_shape correctly
      
      * simplify implementation of consistent_tensor.to_local
      
      * refine
      
      * initialize rpc_token with zero
      
      * refactor grad functions of to_consistent/to_local
      
      * refine
      
      * reformat and address pr comment
      
      * reformat
      
      * add check_meta_consistency in consistent2consistent
      
      * refactor eager_nccl_reduce kernel
      
      * refine
      
      * refine to_consistent api
      
      * ban_non_pod_data_in_eager_boxing
      
      * refine
      
      * refine
      
      * refine
      
      * backup code
      
      * THREAD_LOCAL_CACHED
      
      * Delete thread_local_cache.h
      
      * bugfix: DeviceId4ParallelId -> MachineId4ParallelId
      
      * optimize
      
      * minor fix
      Co-authored-by: tsai <jackalcooper@gmail.com>
      Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
      Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  10. 04 Aug, 2021 2 commits
  11. 31 Jul, 2021 1 commit
  12. 30 Jul, 2021 1 commit
  13. 13 Jul, 2021 1 commit
    • Fea/nn graph/graph build ctx (#5420) · 67272d28
      Committed by Xiaoyu Xu
      * graph api
      
      * add graph dummy test
      
      * add test
      
      * add recursive module mode
      
      * graph.build test pass
      
      * add detail check on graph inner node
      
      * support config and train
      
      * add repr for debug
      
      * test buffer
      
      * test buffer add
      
      * refine test
      
      * add comment
      
      * refine test
      
      * refactor Node to Block
      
      * add named_state
      
      * refine Graph.named_state()
      
      * add state_tensortuple
      
      * graph._compile()
      
      * add mc session 0
      
      * nn.graph: state tuple to private var; add BlockType; add simple multi client session
      
      * NNGraphIf
      
      * rm old graph.cpp
      
      * nn.graph: add cpp NNGraph; export and call NNGraph
      
      * add comment
      
      * nn.Graph: rm prototype MultiClientSession
      
      * nn.Graph: rm prototype MultiClientSession test
      
      * nn.Graph: add TODO
      
      * nn.Graph: format for review
      
      * nn.Graph: format
      
      * nn.Graph: format
      
      * nn.Graph: pass flake8 check
      
      * job_build_ctx
      
      * support lazy context
      
      * format
      
      * lazy mode
      
      * format
      
      * format
      
      * lazy mode add test
      
      * debug session
      
      * init session and job_build_context
      
      * rm temp code
      
      * build default scope
      
      * add default scope
      
      * add scope proto for debug
      
      * check scope
      
      * format
      
      * refine MultiClientSession.resource
      
      * address review
      
      * lazy init scope stack in single-client, instantly init scope stack after MultiClientSession created in multi-client
      
      * fix typo
      
      * address review
      
      * fix clear default session
      
      * merge and test
      Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: leaves-zwx <kunta0932@gmail.com>
      Co-authored-by: cheng cheng <472491134@qq.com>
  14. 07 Jul, 2021 1 commit
    • flow.S/B/P (#5306) · 00f12305
      Committed by qq_22305325
      * flow.S/B/P
      
      * optimize
      
      * optimize
      
      * fix according comment
      
      * add attr in experimental namespace
      
      * optimize
      
      * oneflow_export_value
      
      * refine
      
      * refine
      
      * refine
      
      * customized_symbol module
      
      * refine
      
      * fix bug
      
      * Add new placement init func (#5408)
      
      * add_new_placement_init_func
      
      * flow.env.all_device_placement
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      
      * refine
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  15. 15 Jun, 2021 1 commit
  16. 27 May, 2021 1 commit
  17. 27 Apr, 2021 1 commit
    • Refactor physical run (#4713) · 8b0abc80
      Committed by Li Xinqi
      * mark todo
      
      * inline _Run to PhysicalRun and Logical
      
      * construct instruction_list's shared_ptr and input nullptr to id_generator
      
      * non-allocated space construction function
      
      * modify eager_symbol_list
      
      * modify vm::InstructionMsgList*
      
      * change std::shared_ptr<vm::InstructionMsgList> to vm::InstructionMsgList*
      
      * del istr_list and symbol_list in session
      
      * optimize
      
      * minor fix
      
      * use InstructionsBuilder*
      
      * fix eager run bug
      
      * optimize
      
      * fix bug with update master
      
      * fix bug
      
      * make of_format
      Co-authored-by: wanghongsheng <2496533749@qq.com>
      Co-authored-by: clackhan <han_binbin@163.com>
  18. 17 Mar, 2021 1 commit
  19. 10 Mar, 2021 1 commit
  20. 09 Feb, 2021 1 commit
    • Mig op conf sym (#4213) · aea03748
      Committed by qq_22305325
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * del comment
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  21. 28 Jan, 2021 1 commit
  22. 26 Jan, 2021 1 commit
    • Refactor string symbol (#4148) · a95088f0
      Committed by qq_22305325
      * mig EagerPhysicalBlobHeader
      
      * solve dtype
      
      * mig EagerPhysicalBlob partially
      
      * mig EagerBlobTrait
      
      * fix EagerBlobTrait shape property
      
      * add CHECK
      
      * solve numpy
      
      * enroll blob_trait
      
      * replace EagerPhysicalBlob with oneflow_api.EagerPhysicalBlob
      
      * replace LazyBlob with oneflow_api.LazyBlob
      
      * fix a SyntaxError
      
      * mig eager_blob
      
      * replace EagerBlob with oneflow_api.EagerBlob
      
      * move parallel_size to c++
      
      * Adjust parameter order
      
      * rename fun
      
      * adjust condition
      
      * del useless fun
      
      * refactor_string_symbol
      
      * run make of_format
      
      * rename blob_type to blob_class
  23. 12 Jan, 2021 1 commit
    • Mig op arg para attr (#4102) · d3c8f0c0
      Committed by qq_22305325
      * GetPhysicalOpArgBlobAttr
      
      * cfg hash
      
      * fix bug
      
      * cfg::SbpParallel typed OpArgParallelAttribute.sbp_parallel
      
      * cfg::OptMirroredParallel typed OpArgParallelAttribute.opt_mirrored_parallel
      
      * rename PyLazyConsistentBlob and PyLazyMirroredBlob
      
      * fix EagerConsistentBlob bug of property parallel_size
      
      * use static_cast
      
      * del redefine hash
      
      * mig_op_arg_para_attr
      
      * mig DumpToInterfaceBlobConf
      
      * fix bug
      
      * fix inter_face_blob_conf.proto
      
      * replace None with oneflow_api.INVALID_BATCH_AXIS
      
      * migrate python OpNodeSignatureSymbol to c++ version
      
      * fix CONFLICT
      
      * mig DumpToOpNodeSignature
      
      * mig op_arg_util.py completely
      Co-authored-by: lixinqi <lixinqi0703106@163.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  24. 26 Dec, 2020 3 commits
    • Scope with symbol (#4040) · 6626d998
      Committed by Li Xinqi
      * parallel desc with symbol_id
      
      * migrate ParallelDescSymbol
      
      * fix code format
      
      * fix bug in oneflow_testexe
      
      * Make oneflow worker docker stay alive for 6 hours
      
      * exception
      
      * except in pybind11 and python
      
      * finetune api
      
      * print traceback
      
      * fix bug
      
      * fix format
      
      * ParallelDesc::cfg_parallel_conf
      
      * remove traceback in test_checkpoint
      
      * fix python codeformat
      
      * del job_build_and_infer_cfg_error.py
      
      * optimize api struct
      
      * refactor JobDesc
      
      * migrate JobConfSymbol
      
      * del useless lines
      
      * del useless lines
      
      * add CompileOptionWrongError
      
      * add CompileOptionWrongError
      
      * replace python ScopeSymbol with cpp Scope
      
      * fix typo
      
      * fix bug
      
      * rename OF_COMPLIE_OPTION_EEEOR
      
      * fix conflict
      
      * fix format
      
      * fix bug
      
      * fix conflict
      Co-authored-by: clackhan <han_binbin@163.com>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Job desc with symbol (#4032) · 9c1e4133
      Committed by qq_22305325
      * parallel desc with symbol_id
      
      * migrate ParallelDescSymbol
      
      * fix code format
      
      * fix bug in oneflow_testexe
      
      * Make oneflow worker docker stay alive for 6 hours
      
      * exception
      
      * except in pybind11 and python
      
      * finetune api
      
      * print traceback
      
      * fix bug
      
      * fix format
      
      * ParallelDesc::cfg_parallel_conf
      
      * remove traceback in test_checkpoint
      
      * fix python codeformat
      
      * del job_build_and_infer_cfg_error.py
      
      * optimize api struct
      
      * refactor JobDesc
      
      * migrate JobConfSymbol
      
      * del useless lines
      
      * del useless lines
      
      * add CompileOptionWrongError
      
      * add CompileOptionWrongError
      
      * rename OF_COMPLIE_OPTION_EEEOR
      
      * fix conflict
      Co-authored-by: lixinqi <lixinqi0703106@163.com>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
    • Parallel desc with symbol (#4017) · abdc6dea
      Committed by qq_22305325
      * parallel desc with symbol_id
      
      * migrate ParallelDescSymbol
      
      * fix code format
      
      * fix bug in oneflow_testexe
      
      * Make oneflow worker docker stay alive for 6 hours
      
      * exception
      
      * except in pybind11 and python
      
      * finetune api
      
      * print traceback
      
      * fix bug
      
      * fix format
      
      * ParallelDesc::cfg_parallel_conf
      
      * remove traceback in test_checkpoint
      
      * fix python codeformat
      
      * del job_build_and_infer_cfg_error.py
      
      * optimize api struct
      
      * add CompileOptionWrongError
      
      * rename OF_COMPLIE_OPTION_EEEOR
      Co-authored-by: lixinqi <lixinqi0703106@163.com>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>