- 26 Apr, 2020 1 commit
-
-
Committed by liuwei1031
* save InferVarType changes, test=develop
* remove code comments, test=develop
* tweak code, test=develop
* fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop
* modify fused_bn_activation_op, test=develop
* fix error of fused_bn_activation_op, test=develop
* fix PADDLE_ENFORCE and unittest coverage issue, test=develop
* tweak PADDLE_ENFORCE messages, test=develop
* improve unittest coverage, test=develop
* add StaticGraphInferVarType class, test=develop
* rebase develop branch, test=develop
* fix unittest error, test=develop
* remove comments, test=develop
* improve unittest coverage, test=develop
* improve error message and improve unittest coverage, test=develop
* upgrade InferVarType API, test=develop
* tweak pyfunc error message, test=develop
* fix compilation conflict - save_combine_op, test=develop
-
- 28 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 26 Feb, 2020 1 commit
-
-
Committed by guofei
As the title
-
- 22 Feb, 2020 1 commit
-
-
Committed by tangwei12
* add sync communicator and implement
-
- 10 Feb, 2020 1 commit
-
-
Committed by Wilber
Compile without nccl deps. [1/2] Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
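- 05 Feb, 2020 1 commit
-
-
Committed by Wilber
Added a WITH_NCCL option to cmake to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF. Added the PADDLE_WITH_NCCL definition. NCCL compilation can be disabled for single-machine, single-GPU use; multi-GPU use requires NCCL to stay enabled (the default), and if NCCL is disabled only a single GPU can be used. Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>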
-
- 17 Jan, 2020 1 commit
-
-
Committed by tangwei12
* add half_async in the communicator
* fix DistributedStrategy
-
- 16 Jan, 2020 1 commit
-
-
Committed by zhangchunle
-
- 13 Jan, 2020 1 commit
-
-
Committed by 123malin
* test=develop, bug fix for sparse recorder
-
- 06 Jan, 2020 1 commit
-
-
Committed by 123malin
* add distributed_strategy
-
- 25 Dec, 2019 1 commit
-
-
Committed by zhouwei25
-
- 15 Dec, 2019 1 commit
-
-
Committed by Chen Weihang
* rename paddle throw error macro, test=develop
* fix new error use case, test=develop
-
- 12 Dec, 2019 1 commit
-
-
Committed by tangwei12
* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop
-
- 03 Dec, 2019 2 commits
- 29 Nov, 2019 1 commit
-
-
Committed by hong
* add_dygraph_execution_context
* add dygraph infershape context and execution context; test=develop
* fix imperative bug; test=develop
* remove inputs outputs interface from execution context, because it has the same function as inputNames; test=develop
* remove tracer_test ctest; test=develop
* fix split op bug; test=develop
* fix unittests bug; test=develop
* fix distribute test bug; test=develop
* fix ngraph compile bug; test=develop
* fix grad maker bug; test=develop
* fix load op bugs; test=develop
* fix operator.cc construct bug; test=develop
* remove useless name find in operator; test=develop
* add tracer_test; test=develop
* fix concat, split bug; test=develop
* remove tracer_test unittest; test=develop
* fix attribute check bug; test=develop
* add test code to fix coverage; test=develop
* remove useless code, change check backward input in engine; test=develop
* unlock var type infer shape; test=develop
* add ShareAllLoD api; test=develop
* add dygraph infershape context unittest; test=develop
* remove increase and decrease lod in dygraph; test=develop
* add override; test=develop
* fix increase decrease lod; test=develop
* fix paddle_enforce; test=develop
* disable lod op dygraph check; test=develop
* fix paddle enforce error; test=develop
* add comment for op_registry and OperatorBase; test=develop
* optimize the comment of op_registry; test=develop
* fix format of comment; test=develop
* fix format of comment; test=develop
* optimize the format of comment; test=develop
* optimize the format of the comment; test=develop
* optimize comment of op_registry; test=develop
-
- 01 Nov, 2019 1 commit
-
-
Committed by 123malin
* update pserver decay blocks
* update distributed notify handler
-
- 31 Oct, 2019 1 commit
-
-
Committed by hong
* refactor dygraph, test=develop
* fix failed unittest, test=develop
* polish code, test=develop
* check windows ci error, test=develop try to fix windows ci error by np.allclose, test=develop
* polish vlog and profiler, test=develop
* try to fix preceding ops order, test=develop
* test transformer in windows ci, test=develop
* use python c-api to speed up tracer.trace, test=develop
* test=develop, fix docker with paddle nccl problem
* test=develop, add ut for debug string and gradient_accumulator
* test=develop, add tests for layer/gradient_accumulator/prepared_op
* test=develop, fix compile error for test_prepared_op
* test=develop, add more ut for dygraph
* test=develop, create API.spec for dygraph api change
* optimize grad maker; test=develop
* optimize grad maker
* test
* grad make optim; test=develop
* fix unittest bugs; test=develop
* add dygraph grad op maker and split_op
* grad op maker refactor; test=develop
* add dygraph grad maker; test=develop
* fix op deformable_conv_v1_op bug; test=develop
* fix deformable_conv prroi pool bugs;
* fix new op grad op maker bug; test=develop
* fix split by ref bug; test=develop
* fix dygraph auto prune bug; test=develop
* fix test_trace bug; test=develop
* fix fused emb seq pool bug; test=develop
* remove useless code in op_desc file; test=develop
* remove useless code, StrVarBaseNode; test=develop
* fix review issues; test=develop
* fix rank_loss grad maker; test=develop
* remove flag in VarBase; test=develop
* fix distributed_notify_op compile bug; test=develop
* fix reshape op double grad; test=develop
* fix expand as op; test=develop
* add imperative type_defs.h for demo_train; test=develop
* fix inference lib cmake; test=develop
* fix inference lib; test=develop
* fix inference_lib; test=develop
* fix inference cmake; test=develop
* fix inference lib; test=develop
* fix inference lib; test=develop
* remove condition dygraph grad maker, modify local name; test=develop
* fix split grad maker bug; test=develop
* fix pyramid_op bug; test=develop
* change travis time out limit; test=develop
* restore travis; test=develop
* change timeout limit; test=develop
-
- 28 Oct, 2019 1 commit
-
-
Committed by Chen Weihang
* replace part of the old implementation, test=develop
* restore concat op, test=develop
* update all ops implementation & delete GetDataTypeOfVar func, test=develop
-
- 15 Oct, 2019 2 commits
-
-
Committed by Chengmo
* test=develop, Fix communicator slow bug
* test=develop, delete if() in stop_worker()
* test=develop
* fix UT, test=develop
* fix bug in fetch handler, test=develop
* fix bug in fetch handler, test=develop
* test=develop, fix fetch barrier bug
* test=develop, bug fix
* test=develop, bug fix
* test=develop, fix bug
-
Committed by 123malin
* bug fix: invalid learning rate decay in pserver async mode
-
- 07 Oct, 2019 1 commit
-
-
Committed by tangwei12
Heartbeat for distributed async training.
-
- 30 Sep, 2019 1 commit
-
-
Committed by Chengmo
* refactor geo sgd & communicator
-
- 27 Sep, 2019 1 commit
-
-
Committed by tangwei12
* add a base class for the Communicator
* add AsyncCommunicator Impl for async distributed training
-
- 02 Sep, 2019 1 commit
-
-
Committed by gongweibao
-
- 28 Aug, 2019 1 commit
-
-
Committed by tangwei12
* fix correctness of the communicator
* fix a bug in send thread when sending var context is empty, test=develop
* add lookup_table_prefetch_op and prefetch optimize, test=develop
* remove remote prefetch GPU supported
* word2vec force with CPU, test=develop
* test dist remote lookup table force with CPU, test=develop
-
- 26 Aug, 2019 1 commit
-
-
Committed by tangwei12
* fix sync mode hang in transpiler
* remove sync mode in send/recv
* replace PADDLE_ENFORCE with PADDLE_ENFORCE_NE
-
- 19 Aug, 2019 1 commit
-
-
Committed by zhang wenhui
Add fl_listen_and_serv op for Federated_learning; fl_distribute_transpiler adds this op to the pserver program. This op just listens on the endpoint and performs sum & scale.
-
- 18 Aug, 2019 1 commit
-
-
Committed by gongweibao
Unset unittests http_proxy env to avoid timeout.
-
- 14 Aug, 2019 1 commit
-
-
Committed by Leo Chen
* remove unused DefaultGradOpDescMaker in REGISTER_OPERATOR(), test=develop
* remove SplitIdsOpGradMaker since it is buggy and not tested, update spec file, test=develop
-
- 27 Jun, 2019 1 commit
-
-
Committed by HaoRen
* fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop
* supports collective training in executor
* make fetch_list runnable with variables, add more unittest for use_program_cache test=develop
* fix comment test=develop
* use unique name for nccl_id
* supports output to stream in program_to_code
* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
* set op role in collective training
* add collective op role
* remove orig file
* add build optimizer by strategy
* add collective strategy
* refine collective strategy
* add multi-process role maker
* refine strategy building factory so that we can easily plug in more strategies
* scale loss grad in collective sgd transpiler
* add support for distributed fc
* code format
* revert some features for dist fc
* add support for distributed fc training
* fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop
* supports collective training in executor
* make fetch_list runnable with variables, add more unittest for use_program_cache test=develop
* use unique name for nccl_id
* supports output to stream in program_to_code
* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
* set op role in collective training
* add collective op role
* fix comment test=develop
* remove orig file
* add build optimizer by strategy
* add collective strategy
* refine collective strategy
* add multi-process role maker
* refine strategy building factory so that we can easily plug in more strategies
* scale loss grad in collective sgd transpiler
* add support for distributed fc
* code format
* revert some features for dist fc
* add support for distributed fc training
* test=develop add collective op unittest standard
* test=develop remove the test_collective directory
* test=develop remove the test_collective directory
* remove slicegather test
* code format for reducescatter
* update attr of shard_index_op
* Modify macro nccl_helper
* remove test without distribute
* macro collective_helper
* macro update
* test=develop update support python3.5
* test=develop change gpu memory use to 0.1 when test
* test=develop update ut equal func
* test=develop set flags to 1.5
* test=develop fix pickle dump py35
* test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream
* test=develop update unittest sync operator I/O
-
- 29 May, 2019 1 commit
-
-
Committed by gongweibao
-
- 27 May, 2019 1 commit
-
-
Committed by gongweibao
-
- 24 May, 2019 1 commit
-
-
Committed by chengduo
* This PR adds broadcast for multi-process. And it could be used in dynamic graph to broadcast parameters.
-
- 23 May, 2019 1 commit
-
-
Committed by Qiao Longfei
Async exe support communicator
-
- 17 May, 2019 1 commit
-
-
Committed by Yan Xu
* add var grad hook test=develop
-
- 25 Apr, 2019 1 commit
-
-
Committed by Yan Xu
implement dygraph.parallel.DataParallel to hook reduce op.
-
- 15 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
test=develop
-
- 11 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 10 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
-