- 13 Nov, 2018 1 commit
Committed by Jin Yi
* chore: remove pre compiler funcs * chore: add submodules * fix: fix project build URL from git_url -> submodule_dir_url * fix: fix submodule commit id * fix: fix .gitmodules * chore: mv third_party dir * chore: remove test-driver(glog#188) link in glog submodule * fix: update glog from: da816ea70645e463aa04f9564544939fa327d5a7 ==> to: 4f3e18bf26cdb794fc66cec348f57b5838a0c929 * chore: update README.md Former-commit-id: 8cc052f38cfd53c40186dc487df41b0c1f4a7189
- 06 Nov, 2018 1 commit
Committed by cheng cheng
* random size crop proto * ImagePreprocessImpl::<kCropWithRandomSize> * clang format * MaxVal Former-commit-id: c027432320cc0f03248f9165994150fce058f00a
- 05 Nov, 2018 1 commit
Committed by Shiyuan Shang-Guan
Former-commit-id: 94be14b8189a7123a4012bb34727b32f7ec07599
- 30 Oct, 2018 1 commit
Committed by QiaoJing
* fix normalization epsilon check * remove check, fix epsilon value in op_conf Former-commit-id: 8ad160577179646a4d83f47a40d5de275ad19952
- 29 Oct, 2018 1 commit
Committed by QiaoJing
Former-commit-id: 8111c6c82b09d725b2f520744b2e6b3809288c65
- 26 Oct, 2018 2 commits
- 25 Oct, 2018 1 commit
Committed by Shiyuan Shang-Guan
* refine link ibverbs lib * modify minor Former-commit-id: a7e61a6704b38ca2d4957ee699fd5962be1eac75
- 21 Oct, 2018 1 commit
Committed by Shiyuan Shang-Guan
* fix bug in gcc 5.4 * update Former-commit-id: f180aadf59e9866bd0e0c065726fe5b316efbca6
- 20 Oct, 2018 1 commit
Committed by Shiyuan Shang-Guan
* fix conv in model parallel * add TODO Former-commit-id: 5ed0f04822a94ab9941fda91c8ba8fb18c36aeeb
- 19 Oct, 2018 1 commit
Committed by Jin Yi
* feat: enhance cmake download & options * feat(tools/): add share libs build scripts * fix: add cmake options * feat: add 3rd party download * chore: update README * fix: fix protobuf & cmake repo * fix: fix options name * chore: merge 3rd_party.cmake & third_party.cmake * chore: revert pre cmake URL fix * chore: update ExternalProject check * fix: fix typo & missing download * fix: fix download url * chore: update readme * chore: fix typo * fix: fix bugs * fix: fix bugs * fix: fix pre * print all third party libs * refine readme * DOWNLOAD_THIRD_PARTY -> PRECOMPILED_THIRD_PARTY * refine readme * minor typo fix Former-commit-id: d7d1ec98a868c32e3a43658823ae136caa73feb5
- 17 Oct, 2018 1 commit
Committed by Shiyuan Shang-Guan
* fix bug of snapshot * refine distribute.sh * use more accurate function calls * rename function * update for model parallel * refine code Former-commit-id: e0c2ad2b2dad82e0cb3adce6de9fba98f0c4434c
- 14 Oct, 2018 1 commit
Committed by Juncheng
Former-commit-id: 82681d523fa9e521e2c04b5fd32e6f435f9ba722
- 12 Oct, 2018 3 commits
Committed by Shiyuan Shang-Guan
* refine portmap in epoll * refine code about sockfd * add log Former-commit-id: ca1903b9
Committed by Shiyuan Shang-Guan
* refine ctrl addr (ip and port) * update ctrl client&server * update ctrl client&server * update by comment * update example resource.prototxt Former-commit-id: 6de48fa5
Committed by Shiyuan Shang-Guan
* add scripts to tools/ * update scripts * update distribute.sh * Redirect stderr Former-commit-id: 749560c5
- 11 Oct, 2018 1 commit
- 09 Oct, 2018 3 commits
Committed by Shiyuan Shang-Guan
Former-commit-id: abee1b98
Committed by Shiyuan Shang-Guan
Former-commit-id: 312cfb10
- 05 Oct, 2018 2 commits
Committed by Shiyuan Shang-Guan
* use hostname as log_dir_path and get this_machine_id through ip_addr * update by comment * fix ParseThisMachineId * fixbug * rm TODO Former-commit-id: a18f2912
Committed by Jinhui Yuan
* build nccl from source * refine * refine BUILD_CUDA Former-commit-id: dfd11137
- 03 Oct, 2018 2 commits
Committed by Shiyuan Shang-Guan
* fix decode_random and refine synthetic_data * add example * initialize only once Former-commit-id: a1b44c05
- 02 Oct, 2018 2 commits
- 01 Oct, 2018 5 commits
Committed by Li Xinqi
* available instance num * import shape.proto * PodProto * rename message * union pod is useless * PodPtr * rename: PodPtr::get() => PodPtr::Get() * BlobDescProto.pod * mv register_desc.time_shape into another pr * pod_helper.h * FieldAlignedByteSize * pod_desc * PodDesc copy constructor * BlobDesc::body_shape_pod_desc_ * add BlobDesc::opaque_header_pod_desc_ * align_shift => alignment * default alignment * add field Blob::header_pod_ptr_ * rename AlignedFieldPodProto => FieldPodProto * bugfix * check * FieldId * simplify RtBlobDesc * simplify Blob * ShapedPod => TensorPod * refine ComputePackedBlobDesc Former-commit-id: 8800da93
Committed by Niu Chong
fix: add AsyncSednRegstMsgToConsumer() for send single produced regst, e.g. forward_model_regst (#1274) * fix(normal_model_update_compute_actor): fix send forward_model_regst_ to consumer * fix: add AsyncSednRegstMsgToConsumer() for send single produced regst, e.g. forward_model_regst Former-commit-id: 139c2241
Committed by Shiyuan Shang-Guan
* refine cudnn_limit_buf * rename default_cudnn_buf_limit_mbyte -> cudnn_buf_limit_mbyte Former-commit-id: 7390c2f7
Committed by Jinhui Yuan
Former-commit-id: 28f981eb
- 30 Sep, 2018 1 commit
Committed by Niu Chong
* feat(register_slot): add the RegstSlot * feat(register_slot): update RegstSlot if * feat(actor): update member of Actor to use RegstSlot * fix(register_slot): fix the available_regst_desc_cnt init val * refine(register_slot): rename PushBack/PopFront, FindTheRegstDescId to TryPushBack/TryPopFront, HasRegstDescId * feat(regst_slot): rename ForEachCurRegstDeq/ForEachCurFrontRegst to ForEachRegstDeq/ForEachFrontRegst * feat(regst_slot): add ForChosenRegstDeq/ForChosenFrontRegst, add CHECK empty in ForEachFrontRegst * fix(register_slot): fix the CHECK empty * feat: remove actual_writeable_regst_desc_id_ from Actor, add Naive/CustomizedProducedRegst * fix(normal_model_update_actor): bug: not send customized regst to consumer when SendIntialModel * fix(normal_forward_compute_actor): bug: not add kLoss/kAccuracy produced regst to NaiveProducedRegst * fix(actor): UNIMPLEMENTED() for AsyncSendCustomizedProducedRegstMsgToConsumer * fix(normal_forward_compute_actor): set const_buf_regst to nullptr when recv from consumers * fix(actor): total_reading_data_regst_cnt, not total_reading_ctrl_regst_cnt * refactor: update GetNaiveConsumedRegstDescName to GetNaiveOrCustomizedConsumedRegstDescName(same for Produced) * feat: combine data_regst and ctrl_regst in Actor * fix: fix bugs * fix: fix bugs * fix: remove .swp files and unused LOG * feat: split Act and SendMsg (#1255) * feat: split Act and SendMsg * refine: rename HandleProduced/ConsumedDataRegst.. to HandleProduced/ConsumedNaiveDatRegst.. 
* fix(input_wise_comp_actor): bug: not set piece id * fix(actor): potential bug: produced msg with no allowed actor still pop from queue * refactor: mv some protected member function to private * fix(actor): fix the condition about sending EORD msg * refactor(input_wise_actor): use RegstSlot in InputWiseActor * fix(copy_comm_net_actor): rename piece_id2regst_ctx to piece_id2regst_ctx_ * refactor: rename Name2RegstDescId to Name2RegstDescIds * refactor(naive_actor): "override final" instead of only "final" * refine(actor): little refine * feat: update the return type of GetNaiveOrCustomizedNamesRegstDescName to enum class RegstNameType Former-commit-id: e042befc
- 26 Sep, 2018 2 commits
Committed by Shiyuan Shang-Guan
* add lars set * add lars * override ibn&obn to lbi * make model update consistent * check cuda stream sync * add LARSUpdateModelGpu * checkout naive & momentum model update * use cublas::dot compute SumOfSquare * update lars for master * refine lars for master Former-commit-id: 9518970b
Committed by qq_22305325
* hinge_loss_kernel_test * fix opkernel_test * fix test file * optimize test file * optimize opkernel test * complete opkernel test interface Former-commit-id: 7faf75a6
- 25 Sep, 2018 2 commits
Committed by Jinhui Yuan
* remove useless Copy in device_context * fix cyclic and copy_to_local bug in binary_in_stream_with_local_copy Former-commit-id: 4b2c4ef0
- 24 Sep, 2018 1 commit
Committed by Jinhui Yuan
* add nccl dependency * add nccl comm handle * nccl allreduce works * NcclAllreduce -> NcclAllReduce * fix header guard * add NcclReduceScatter, NcclAllGather * complete ReduceScatter and AllGather, (with cuda error) * change variable name * reduce-scatter, all-gather works * add NcclScatter and NcclGather work type * Dev use nccl add nccl comm manager (#1206) * add parallel_set_id * add nccl_comm_manager * log nccl comm create * use NcclCommMgr * bugfix * OF_DISALLOW_COPY_AND_MOVE * remove nccl_scatter_handle and nccl_gather_handle from DeviceCtx * remove nccl handles from cuda_stream_handle * nccl_util and GetNcclDataType * fix rank_num * fix rank_id fix rank_id * CudaCheck->NcclCheck * only GPU * PoorCompTaskNode SoleIn, SoleOut, SoleOp, SoleIbn, SoleObn * PoorCompTaskNode * reformat * format change * Dev use nccl merge reduce share mem (#1216) * add parallel_set_id * add nccl_comm_manager * log nccl comm create * use NcclCommMgr * bugfix * OF_DISALLOW_COPY_AND_MOVE * remove nccl_scatter_handle and nccl_gather_handle from DeviceCtx * remove nccl handles from cuda_stream_handle * nccl_util and GetNcclDataType * fix rank_num * fix rank_id fix rank_id * CudaCheck->NcclCheck * only GPU * PoorCompTaskNode SoleIn, SoleOut, SoleOp, SoleIbn, SoleObn * PoorCompTaskNode * reformat * ReduceGather * GlobalAdd * ReduceScatter * EnableIfNeed * ConcatSplit * EnableMemSharing for pred if need EnableMemSharing for pred if need * CtrlEdge for Gather * CtrlEdge for GlobalAdd * LocalAdd CtrlEdge * CollectReduceTaskNode * reverse nodes * local_add_mem_sharing local add mem sharing * global add mem sharing * reduce_mem_sharing * bugfix * refine * format change (remove empty lines) * format change * fix local_add and gather issues * Dev refactor reduce add (#1218) * change ReduceGlobalAdd to ReduceAdd * rm ReduceLocalAdd * no mem sharing case works * let ReduceAddCompActor decide whether it is local or global * multi machine multi gpus Nccl and Oneflow allreduce works * refine * 
extract SortEdges * make EdgeInfo protected * Dev use nccl refine (#1220) * const qualifier * PoorCompTaskNode=>PipeCompTaskNode * int=>int32_t * refine ReduceMemSharingCtx * NcclDeviceCtx and NcclActor NcclDeviceCtx and NcclActor * empty line * CudaDeviceCtx<-NcclDeviceCtx * fix wrong rank_id in reduce_add_actor (#1229) * fix wrong rank_id in reduce_add_actor * rm device_num_of_each_machine from parallel_ctx * fix reduce gather control edge (#1235) * fix reduce gather control edge * extract FindNearestReduceAddCompTaskNode * extract method ReduceCompTaskNodeIf::FindPredRduceTaskNodeIf * CHECK nearest_add_copy_d2h * Dev use nccl cross machine nccl all reduce (#1246) * support ncclAllReduce cross machine * fix rank_id and rank_num for mix * reformat * reformat * simplify nccl_kernel (#1256) * simplify REGISTER_BLD_SUB_TSK_GPH_MTHD (#1260) * simplify REGISTER_BLD_SUB_TSK_GPH_MTHD * note * Dev use nccl reduce ranking ctx (#1252) * reformat * compute rank_id and rank_num with FixCompTaskNode * reformat * fix rank_id for reduceadd * ReduceRankingCtx * New Ranking and MemSharing for Reduce * DECLARE_REDUCE_LOGICAL_NODE * Ranking4NcclAllReduce * fix ranking * remove AsTaskNode * reformat * runtime rank ctx * rank_set * bugfix * bugfix * unittest * change use_nccl_all_reduce_cross_machine to use_nccl_inter_node_communication * refine refine * move BuildCtrlRegstBetweenReduceCopyNodes to ReduceAddCompTaskNode * CHECK mem_size_ Former-commit-id: 55496813
- 23 Sep, 2018 1 commit
- 19 Sep, 2018 2 commits
Committed by Shiyuan Shang-Guan
Former-commit-id: 31693ec1