1. 26 Feb, 2021 2 commits
    • J
    • Mig id util and scope util (#4217) · a66b20e8
      Committed by qq_22305325
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * mig OpKernelObject
      
      * mig object_storage
      
      * make of_format
      
      * del comment
      
      * std::function<void(Object*)
      
      * mig NewOpKernelObject and _StatefulCallOpKernel
      
      * mig _StatefulCallOpKernel and GetSharedOpKernelObject4ParallelConfSymbol
      
      * del object_storage.cpp
      
      * use name GLOBAL_PARA_SYM2SHARED_OPKENEL_OBJ_MUTEX
      
      * mig CheckRefInBlobObjectParallelDesc and  OperandBlobObjects rel api
      
      * mig _StatelessCall
      
      * mig _StatelessCall
      
      * del comment
      
      * mig id_util and scope_util
      
      * use cfg_op_conf and Object*
      
      * use Object*
      
      * del _
      
      * fix func name error
      
      * use MapAt and shared_ptr
      
      * use shared_ptr or const ref
      
      * minor fix
      
      * add todo
      
      * minor fix
      
      * minor adjustment
      
      * minor fix
      
      * minor fix
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  2. 25 Feb, 2021 9 commits
    • Remove machine ctx (#4254) · 41d2bde5
      Committed by qq_22305325
      * add CtrlConf Proto
      
      * add HostListBootStrapClient
      
      * add HostListBootStrapServer
      
      * del OfOnceCall in host_list_boot_strap_client
      
      * add BootStrapServer/Client
      
      * Update control.proto
      
      del rank2ctrl_addr
      
      * add InitConfFromEnvDesc
      
      * add log
      
      * optimize code
      
      * add CHECK
      
      * InitCtrlConfFromEnvDesc
      
      * del useless args def
      
      * CtrlBootstrap
      
      * minor fix
      
      * refactor CtrlServer/CtrlClient with ProcessCtx
      
      * RankInfoBootstrap
      
      * fix bug and optimize
      
      * minor optimize
      
      * minor fix
      
      * use WorkerProcessInfo
      
      * minor optimize
      
      * remove MachineCtx
      
      * del head file include
      
      * use GlobalProcessCtx
      
      * fix test bug
      
      * rename api and refactor EnvDesc::TotalMachineNum()
      
      * del GetCtrlAddr family api in GlobalProcessCtx
      
      * fix namespace name
      Co-authored-by: lixinqi <lixinqi0703106@163.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Tensor for autograd (#4257) · d27e6391
      Committed by Yinggang Wang
      * initial tensor
      
      * combine blobdesc in tensor, initial MirrorTensor
      
      * fix typo
      
      * add consistent blob, tensorimpl and outer consistent tensor
      
      * add public method for tensor
      
      * remove unrelated file change
      
      * remove unrelated file change
      
      * fix: reviewer suggestions
      
      * fix: reviewer suggestions
      
      * Delete device.cpp
      
      * Add storage-related methods
      
      * Remove unchanged file
      
      * Remove unchanged file
      
      * add storage-related tensor interfaces
      
      * add some basic setter function
      
      * remove expose blob concept
      
      * remove useless header
      
      * remove useless header
      
      * add detail for new tensor implementation without blob desc
      
      * reorganized
      
      * reorganized
      
      * add get/set for blob object
      
      * refactor device remove is_lazy()
      
      * replace method in tensor
      
      * refactor device
      
      * refactor base tensor
      
      * remove header
      
      * refactor tensor impl
      
      * refactor tensor and device
      
      * refactor tensor impl
      
      * getters should return reference
      
      * add final
      
      * refactor tensor impl
      
      * add final for tensor impl
      
      * fix typo add #endif
      
      * unsolved comments
      
      * remove useless header
      
      * add protected method for tensor impl
      
      * compile and modified getters to return immutable parameters
      
      * code format
      
      * carefully deal with const property
      
      * feat(Tensor): add TensorArg
      
      * remove constness of blob object, rename parallel_conf to parallel_desc
      
      * feat(Tensor): add interface in tensor for autograd
      
      * feat(TensorArg): update codes
      
      * feat(Tensor): move grad_fn_node to tensor
      
      * feat(Tensor): grad_fn_node use const
      
      * feat(Tensor): grad_fn_node use const
      
      * feat(Tensor): update codes
      
      * feat(Tensor): add comment in Tensor
      
      * feat(Tensor): update codes
      Co-authored-by: poohRui <yuruil@qq.com>
      Co-authored-by: liyurui <32978179+poohRui@users.noreply.github.com>
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • J
      83cc313c
    • Remove old version ofrecord load / decode (#4267) · bc81dff0
      Committed by cheng cheng
      * old decoder api use new user op impl
      
      * Remove ofrecord_load and decode_ofrecord
      
      * refine code for review
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Dev multi optimizer (#4239) · 9b1f2176
      Committed by scxfjiang
      * update train conf
      
      * python front end draft
      
      * deprecated primary/secondary learning rate
      
      * update
      
      * update
      
      * backend draft
      
      * update
      
      * backup to remote work
      
      * fix compile issue in auto learning rate
      
      * fix job build and infer ctx
      
      * pass compile
      
      * use new get variable API
      
      * test python frontend
      
      * add naive multi-sgd optimizer test script
      
      * formal test script
      
      * update
      
      * refine code style
      
      * fix typo
      
      * update comments
      
      * format on OF server
      
      * fix comment
      
      * fix by CI test
      
      * update
      
      * format on OF server
      
      * refine by review
      
      * format on OF server
      
      * fix
      
      * format on OF server
      
      * change return value type from Sequence to List
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Turn off Aliyun 3rd party backup (#4261) · b6706430
      Committed by Shenghang Tsai
      * Update test.yml
      
      * Update test.yml
    • W
    • Mig stateless call (#4215) · 8c0dca90
      Committed by qq_22305325
      * mig parallel_conf_util
      
      * mig BuildInitialScope BuildScopeWithNewParallelDesc BuildScopeWithNewParallelConf
      
      * add test of GetDeviceTagAndMachineDeviceIds
      
      * mig GetOpConfSymbol
      
      * fix BuildScopeWithNewParallelDesc input type error
      
      * use TRY
      
      * use symbol::Storage<OperatorConfSymbol>
      
      * _NewOpKernelObject
      
      * mig OpKernelObject
      
      * mig object_storage
      
      * make of_format
      
      * del comment
      
      * std::function<void(Object*)
      
      * mig NewOpKernelObject and _StatefulCallOpKernel
      
      * mig _StatefulCallOpKernel and GetSharedOpKernelObject4ParallelConfSymbol
      
      * del object_storage.cpp
      
      * use name GLOBAL_PARA_SYM2SHARED_OPKENEL_OBJ_MUTEX
      
      * mig CheckRefInBlobObjectParallelDesc and  OperandBlobObjects rel api
      
      * mig _StatelessCall
      
      * mig _StatelessCall
      
      * del comment
      
      * use cfg_op_conf and Object*
      
      * use Object*
      
      * del _
      
      * fix func name error
      
      * use MapAt and shared_ptr
      
      * use shared_ptr or const ref
      
      * minor fix
      
      * add todo
      
      * minor fix
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
    • Reduce unittest cost (#4253) · 2981ca73
      Committed by cheng cheng
      * reduce unittest cost
      
      * format
      
      * refactor unittest for reduce cost
      
      * refine code for review
      
      * moments test double
      
      * rollback top k shape dim
      
      * fix axis err in test_prelu
      Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
  3. 24 Feb, 2021 7 commits
  4. 23 Feb, 2021 8 commits
  5. 22 Feb, 2021 4 commits
  6. 21 Feb, 2021 1 commit
  7. 20 Feb, 2021 5 commits
  8. 19 Feb, 2021 3 commits
  9. 18 Feb, 2021 1 commit
    • NCCL use compute stream to memory cost & speed up (#4221) · 45697b0c
      Committed by cheng cheng
      * Enable insert nccl logical op pass
      
      * FindMaxConnectedSubgraphForGpuExecOrder~
      
      * through order and interface
      
      * implement of insert nccl logical op in pass
      
      * add nccl logical op using UserOp Implement and EagerNcclCommMgr
      
      * add NCCL ReduceScatter op/kernel; refine pass impl of topo order
      
      * add NCCL logical op/kernel AllGather
      
      * fix bug of reduce scatter/ all gather infer shape
      
      * refine log and note
      
      * fix compiler err build with CPU ONLY
      
      * support NCCL ALL2ALL and test pass of alexnet model parallel
      
      * rollback of diff in checkpointing_pass.cpp
      
      * rename to nccl_use_compute_stream; ResourceDesc::nccl_use_compute_stream; refine name for review; create nccl_comm_ in KernelCompute;
      
      * refine code for review
      
      * add unittest for nccl use compute stream
      
      * format test scripts
      
      * refine align