- 01 Nov, 2021 1 commit
Committed by Zhanghuihong Guan
* initial commit, add code for async construct tensor from numpy array * inital commit to change Maybe to Optional * delete redundant code * replace Maybe with Optional * fix compile errors * format code * changes based on review * format code, fix based on review * format code * fix multiclient type * changes based on review * changes based on review * unify calling to IsMultiClirnt * refector multi_client related code * restore InMultiClient interface * double check for unnecessary changes * remove unnecessary changes * format code * Update oneflow/api/python/symbol/job_conf_symbol.cpp * Update oneflow/api/python/symbol/op_conf_symbol.cpp * Update oneflow/api/python/symbol/op_node_signature_symbol.cpp * Update oneflow/core/common/optional.h * Update oneflow/api/python/symbol/string_symbol.cpp * Update oneflow/api/python/symbol/scope_symbol.cpp * Update oneflow/api/python/symbol/placement_symbol.cpp * Update oneflow/api/python/symbol/op_conf_symbol.cpp Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> Co-authored-by: Twice <i@twice.moe>
-
- 23 Sep, 2021 1 commit
Committed by Juncheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 11 Sep, 2021 1 commit
Committed by cheng cheng
* Fix bug of Multi-Client src tick output order * Add input/output ctrl edge to DstSubTick for order io and callback_notify * add test scripts * remove note * auto format by CI * add note of sleep * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
-
- 08 Sep, 2021 1 commit
Committed by Juncheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 06 Sep, 2021 1 commit
Committed by Juncheng
* Remove IDMgr::GetGpuPhyIdFromThrdId/IDMgr::GetDeviceTypeFromThrdId * CHECK(new_task_id_) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 17 Aug, 2021 1 commit
Committed by Juncheng
* Remove GlobalWorkStreamId/GlobalThrdId * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 15 Aug, 2021 1 commit
Committed by Tianyu Zhao
* Rename `ParallelDistribution` to `NdSbp` * Rename `ParallelDistribution` to `NdSbp` * Rename `ParallelDistribution` to `NdSbp` * auto format by CI * Rename `ParallelDistribution` to `NdSbp` Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 12 Aug, 2021 1 commit
Committed by Tianyu Zhao
* Rename `parallel_distribution` to `nd_sbp` * Rename filenames containing `parallel_distribution` * auto format by CI * Rename `parallel_distribution` to `nd_sbp` * auto format by CI * Rename `parallel_distribution` to `nd_sbp` * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 02 Aug, 2021 1 commit
Committed by Luyang
* 0-dim tensor support * test case * add more test * refine * update * update default constructor * reconstuct * merge master * remove notes * remove useless codes * fix comments * fix comment * add test case * format * refine * refine * refine * refine * MirroredTensorMeta::MirroredTensorMeta() * support 0-dim slice * support 0-dim slice grad * refine * auto format by CI * refine * refine * auto format by CI * refine * fix slice bug * auto format by CI * fix resnet50 0-im loss uasge * fix 0-dim tensor usage in test cases * add skip test * auto format by CI * fix test_dataset * check blobdesc.shape init * auto format by CI * remove useless empty shape init * fix l1loss 0-dim error * auto format by CI * fix argmax op test * fix add_n op test * auto format by CI * fix bce loss op test * auto format by CI * fix squeeze op test * fix conv2d op test * fix xpu_shape for clip_grad_norm * auto format by CI * resolve confilct * fix multi-cpu slice_copier 0-dim bug * auto format by CI * add memory copy for 0-dim * auto format by CI * support copy0dim * refine * auto format by CI * remove unuse codes * fix check for kldivloss * gpu 0-dim copy * auto format by CI * fix clip_grad_norm doctest * fix reduce_ops doctest * fix argmax doctest * fix loss module doctests * fix math_ops doctests * fix norm modules doctest Co-authored-by: Xinqi Li <lixinqi0703106@163.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 16 Jul, 2021 1 commit
Committed by Li Xinqi
* refactor job_pass by maybe_system * remove useless files Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 21 Jun, 2021 1 commit
Committed by leaves-zwx
* MemZoneId Former-commit-id: 7550a129f15554c5a6e480b728079e431c00be25 * move mem zone id source code Former-commit-id: 3859fc2a0fcda2fb23e57e886a0e3f1c0833d111 * revert Former-commit-id: 5cf3ad7caebe787918d1ca1c0467415656d9b491 * refine GetProxyNode using MemZoneId Former-commit-id: fba035f20b44b1acce2900b86b5bd24654e0d982 * refactor MemZoneId121 Former-commit-id: 0868a6139f1cf20dc7474d0a88714e03721c8e8e * replace using IDMgr interface Former-commit-id: 98b5db9ed879cd1d8197efd174c6d680bec69560 * fix linkage * rm useless comment * replace IsGpuMemZone * format * rm deprecated mem zone api in IDMgr * fix merge conflict error * refine mem zone id to include node index * revert added header * direct init device_id * address review * address review Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 15 Jun, 2021 1 commit
Committed by liufengwei0103
* refactor SbpXXX to cfg::SbpXXX * modify ParallelDistributionHint4InputArgNameAndIndex to be const function * fix sbp to cfg::sbp in job_pass * fix bug ToProto, InitFromProto and pb passed to cfg * auto format by CI * fix gpt segment fault * fix xla * tmp commit * tmp commit * fix xla compile error * [fix bug] return tmp in model_io_v2 * auto format by CI Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: poohRui <yuruil@qq.com>
-
- 05 May, 2021 1 commit
Committed by cheng cheng
* Fw/Bw support double compute stream * NCCL comm create by stream id * 2D NCCL logical kernel support BW independent stream * StreamIndex: NcclComputeStream for each subgraph insert nccl logical. * refactor code * refine code for review * Add WITH_CUDA in DoJobPass(InsertNcclLogicalOpPass)
-
- 29 Apr, 2021 1 commit
Committed by cheng cheng
* Pipeline Parallelism: checkpointing insert identity buffer op * fix complier err * identity buffer op custom out regst num * fix bug and runnable * Chain merge divide fw/bw; MemChain ignore merge; copyhd regst num hack * Pipeline buffer pass * Pipeline runnable * rollback NOT merge mem chain hack * pipeline_stage_id_hint and rollback checkpointing buffer * Pipeline buffer only. test pass. * rollback repeat hack * Remove CopyHd Hack; Add buffer cross label loader and loss * refine code for review & fix for new dtype infer * add note Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 19 Apr, 2021 2 commits
Committed by cheng cheng
* NCCL logical refine timeshape * Insert nccl ops after acc interface * Inser NCCL ops after acc implement; need refine or add new acc_tick_op * deadlock * speed up and run * add acc tick fix deadlocak ; and add nccl comm debug log * refine log: rm cc_debug_log and cclog * use reference for speed up * refine code for review * fix for review Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by cheng cheng
* Remove RtBlobDesc * refine code for RuntimeBlobShapeInferHelper::BlobDesc4BnInOp Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 12 Apr, 2021 1 commit
Committed by Juncheng
-
- 07 Apr, 2021 1 commit
Committed by Juncheng
* Fix include cuda header * Fix
-
- 31 Mar, 2021 1 commit
Committed by cheng cheng
* Insert NCCL logical op pass support hierarchy * Add NCCL logical 2D SBP op/kernel support (*P)->(*B) * Add NCCL logical 2D SBP op/kernel support (P*)->(B*) * Fix bug and support (*, S(0)) -> (*, B) [dim1:AllGather] and (*, S(in)) -> (*, S(out)) [dim1:All2All] * Fix BUG and runnable * fix hierarchy equal bug
-
- 25 Mar, 2021 1 commit
Committed by cheng cheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 23 Mar, 2021 1 commit
Committed by cheng cheng
* half implement of build task graph by 1regst1blob * Complete support 1 regst 1 blob * fix check * Add Lbis in TaskEdge and check valid * reduce NormalForward out regst name prefix * refine hasher and proxy key * fix bug of ProxyKey == * fix bug of collective boxing broadcast task edge lbi Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 19 Mar, 2021 1 commit
Committed by cheng cheng
* Half implement of remove LogicalGraph by OpGraph * Remove Logical Graph/Node * rename CompTaskNode::op and src_logical * fix compiler err of xrt * fix bug of REIGSTER in graph dir Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 16 Mar, 2021 1 commit
Committed by guo ran
* Add hierarchical_sub_task_graph_builder * fix * fix * fix * refine * refine * refine * refine Co-authored-by: cheng cheng <472491134@qq.com>
-
- 15 Mar, 2021 2 commits
Committed by Wang Tuo
* XXId structs and IdUtil * rm useless header * update id_util by discuss * update generate common thrd id and independent thrd id by IdUtil api * minor update * use IdUtil to generate task id in UpdateTaskId * Global<IdUtil> * emplace CommNetThrdId and TickTockThrdId call * implement IDMgr MemZoneId related api with IdUtil MemZoneId api * add GenerateChainId api * replace IDMgr api with IdUtil * rm useless header * revert IDMgr mem_zone_id api * rm redefinition of GetGpuPhyIdFromMemZoneId * modify by review comment * safety modification * def TaskType hash function * XXId structs and IdUtil * rm useless header * update id_util by discuss * update generate common thrd id and independent thrd id by IdUtil api * minor update * use IdUtil to generate task id in UpdateTaskId * Global<IdUtil> * emplace CommNetThrdId and TickTockThrdId call * implement IDMgr MemZoneId related api with IdUtil MemZoneId api * add GenerateChainId api * replace IDMgr api with IdUtil * rm useless header * revert IDMgr mem_zone_id api * rm redefinition of GetGpuPhyIdFromMemZoneId * modify by review comment * safety modification * def TaskType hash function * rm old test * fix by self review * change name * fix typo and enhance error info * refactor thread manager * more check * rm AllocateCpuThrdIdEvenly * refactor StreamId and rm IdUtil * stream index generator * modify by review * update stream index * update id util * update comm net task node * add TaskIdGenerator * update task id generation * replace gen thrd_in in logical node * replace GetGpuComputeThrdId in boxing sub task graph builder * replace h2d and d2h thrd_id in CopyHdTaskNode * replace h2d and d2h thrd_id in SliceBoxingSubTskGphBuilder * update id_util header * CHECK NOTNULL stream index generator * add chain_id_generator * rm IdUtil Glabol New * rm stream type in thread manager * CHECK_NOTNULL stream_index_generator in logical node * update id manager * update id_util * fix compile errors * tidy code * tidy code * revert 
format * mv std::hash<TaskType> to task_node.h * use unique_ptr to manage thread * fix typo * format * modify by review * start up * rm chain id generator * move id serialization to independent implementation * rm useless friend * fix compile error under gcc 4.8 * rm IsXxxStreamIndex * rm deprecated api in IDMgr * fix bug in CPUStreamIndexGenerator::GenerateComputeStreamIndex * refine id structs * refine id struct serialization * refine task id generator * refine StreamIndexGeneratorManager * refine copy task node * refine collective boxing sub task graph builder * refine slice boxing sub task graph builder * refine naive b2p sub task graph builder * refine logical node * refine id manager * refine thread manager * rm useless comment * remove magic number * revise header to be compatible with cpu-only compilation * more readable * fix bug * refine code * use HashCombine * replace type of bit shift const value with size_t * add testcase for fake dev * refactor mem_zone_id * reformat * add fake device allocator/deallocator * task_node InitProducedRegstMemCase add fakedev * Add stream_index_getter * format and fix tick tock task type Signed-off-by: daquexian <daquexian566@gmail.com> * skip fake device test for now Signed-off-by: daquexian <daquexian566@gmail.com> * refine Memcpy for fake dev * update for debug Signed-off-by: daquexian <daquexian566@gmail.com> * some update for fake device Signed-off-by: daquexian <daquexian566@gmail.com> * remove debug code Signed-off-by: daquexian <daquexian566@gmail.com> * reg for fake device creating thread * minor fix * format * refine stream index getter Signed-off-by: daquexian <daquexian566@gmail.com> * for debug * refine stream_index_getter * fix the code * delete fakedev unit test script * delete the code which is no relationship with stream_index_getter * delete test_tmp_dir * fix format * move xxx_compute_task_node.h from folder graph_impl to folder graph Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: yaochi <later@usopp.net> Co-authored-by: Ldpe2G <liangdepeng@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by cheng cheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 10 Mar, 2021 1 commit
Committed by cheng cheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 02 Mar, 2021 2 commits
Committed by cheng cheng
* Remove AreaId * refine check for scope symbol id * refine logical node macro * rollback error change in group_boxing_by_dst_parallel Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by leaves-zwx
* XXId structs and IdUtil * rm useless header * update id_util by discuss * update generate common thrd id and independent thrd id by IdUtil api * minor update * use IdUtil to generate task id in UpdateTaskId * Global<IdUtil> * emplace CommNetThrdId and TickTockThrdId call * implement IDMgr MemZoneId related api with IdUtil MemZoneId api * add GenerateChainId api * replace IDMgr api with IdUtil * rm useless header * revert IDMgr mem_zone_id api * rm redefinition of GetGpuPhyIdFromMemZoneId * modify by review comment * safety modification * def TaskType hash function * XXId structs and IdUtil * rm useless header * update id_util by discuss * update generate common thrd id and independent thrd id by IdUtil api * minor update * use IdUtil to generate task id in UpdateTaskId * Global<IdUtil> * emplace CommNetThrdId and TickTockThrdId call * implement IDMgr MemZoneId related api with IdUtil MemZoneId api * add GenerateChainId api * replace IDMgr api with IdUtil * rm useless header * revert IDMgr mem_zone_id api * rm redefinition of GetGpuPhyIdFromMemZoneId * modify by review comment * safety modification * def TaskType hash function * rm old test * fix by self review * change name * fix typo and enhance error info * refactor thread manager * more check * rm AllocateCpuThrdIdEvenly * refactor StreamId and rm IdUtil * stream index generator * modify by review * update stream index * update id util * update comm net task node * add TaskIdGenerator * update task id generation * replace gen thrd_in in logical node * replace GetGpuComputeThrdId in boxing sub task graph builder * replace h2d and d2h thrd_id in CopyHdTaskNode * replace h2d and d2h thrd_id in SliceBoxingSubTskGphBuilder * update id_util header * CHECK NOTNULL stream index generator * add chain_id_generator * rm IdUtil Glabol New * rm stream type in thread manager * CHECK_NOTNULL stream_index_generator in logical node * update id manager * update id_util * fix compile errors * tidy code * tidy code * revert 
format * mv std::hash<TaskType> to task_node.h * use unique_ptr to manage thread * fix typo * format * modify by review * rm chain id generator * move id serialization to independent implementation * rm useless friend * fix compile error under gcc 4.8 * rm IsXxxStreamIndex * rm deprecated api in IDMgr * fix bug in CPUStreamIndexGenerator::GenerateComputeStreamIndex * refine id structs * refine id struct serialization * refine task id generator * refine StreamIndexGeneratorManager * refine copy task node * refine collective boxing sub task graph builder * refine slice boxing sub task graph builder * refine naive b2p sub task graph builder * refine logical node * refine id manager * refine thread manager * rm useless comment * remove magic number * revise header to be compatible with cpu-only compilation * more readable * fix bug * refine code * use HashCombine * replace type of bit shift const value with size_t * rm ProcessId and make rank as member of DeviceId * update id serialization with ProcessId update * make type definition local namespace * rm ProcessId in task graph * update DeviceId usage in logical node * update DeviceId usage in id manager * update rank usage in ThreadMgr * minor change * detail modification * tidy header * tidy header Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 26 Feb, 2021 1 commit
Committed by cheng cheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 20 Feb, 2021 2 commits
Committed by OuYang Yu
* remove PartialTick * remove op_conf.has_partial_tick_conf() Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by leaves-zwx
* rm LocalWorkStreamId * rm AllocateLocalWorkStreamId in TaskNode * rm local work stream id in task node and commnet task node * rm local_work_stream_id param in NewTaskId * fix test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 19 Feb, 2021 1 commit
Committed by cheng cheng
* Remove keep_header_only and BlobDesc::is_body_disabled * Remove InputBlobModifier::use_header_only and UserOps set_use_header_only
-
- 18 Feb, 2021 1 commit
Committed by cheng cheng
* Enable insert nccl logical op pass * FindMaxConnectedSubgraphForGpuExecOrder~ * through order and interface * implement of insert nccl logical op in pass * add nccl logical op using UserOp Implement and EagerNcclCommMgr * add NCCL ReduceScatter op/kernel; refine pass impl of topo order * add NCCL logical op/kernel AllGather * fix bug of reduce scatter/ all gather infer shape * refine log and note * fix complier err build with CPU ONLY * support NCCL ALL2ALL and test pass of alexnet model parallel * rollback of diff in checkpointing_pass.cpp * rename to nccl_use_compute_stream; ResourceDesc::nccl_use_compute_stream; refine name for review; create nccl_comm_ in KernelCompute; * refine code for review * add unittest for nccl use compute stream * format test scripts * refine align
-
- 08 Feb, 2021 1 commit
Committed by Li Xinqi
* source subset tick * remove useless header files * insert DstSubsetTickOp * remove incorrect CHECK * TryBindBnWithOneofRegst * fix typo in task_graph * rename and refactor DstSubsetTick::InferBlobDescs and SrcSubsetTick::InferBlobDescs Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 05 Feb, 2021 1 commit
Committed by Juncheng
* Add Operator::InferInplaceObn2IbnIf * remove useless header * make InferInplaceObn2Ibn protected
-
- 03 Feb, 2021 1 commit
Committed by guo ran
* refactor boxing_sub_task_builder * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 26 Jan, 2021 1 commit
Committed by Juncheng
-
- 30 Nov, 2020 1 commit
Committed by Juncheng
* Add NaiveB2PSubTskGphBuilder * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 27 Nov, 2020 1 commit
Committed by cheng cheng
* using new chain aglorithm * fix bug of chain merge * fix bug of bfs search * fix order of rm empty adn chain merge * Try NOT merge in MemChain * using DfsTopoForEachNodeSortByDistanceToSink for set order in graph * fix compile err * rollback for topo order * using area id split optimizer with fw/bw chain * NOT consider tick in merge chain * use area id to split optimizer chain and fw/bw chain * remove note * refine code for review * make docker container stay live 1 hour Co-authored-by: OuYang Yu <xuanjiuye@gmail.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
-
- 21 Oct, 2020 1 commit
Committed by cheng cheng
* Remove CheckNoCycle in chain graph * remove hack code in oneflow.where op test * new rule for order in graph * remove ordered_chain_nodes_ in chain graph Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: oneflow-bot <69100618+oneflow-bot@users.noreply.github.com>