- 15 11月, 2021 2 次提交
-
-
由 Zeng Jinle 提交于
* add split_program * make ut faster * increase ut timeout * make result deterministic * add fuse_all_reduce pass * add ut framework, update * fix ut framework * remove useless code * add coverage support * update * fix CI * fix some bugs and fix ci coverage * fix conflict
-
由 zmx 提交于
* fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop
-
- 11 11月, 2021 2 次提交
-
-
由 xiayanming 提交于
* fleet support elastic train * fleet support elastic train * support elastic * add unittest * fix unitest bug * fix unittest bug * fix unittest bug * fix unittest coverage * fix unittest coverage * fix unittest coverage * fix unittest coverage * fix unittest coverage * fix elastic bug * fix ci fail * fix ci fail * fix elastic bug * fix elastic bug * fix joint debugging bug * fix joint debugging bug * fix windows ci failed * fix windows ci failed
-
由 zmx 提交于
* change username * fix * fix * fix * fix * fix * update * update * update unittests * fix * update * fix * update * fix * fix * fix * update * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update send_and_recv op. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix unit. notest,test=coverage * fix ut. notest, test=coverage * update. notest,test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix. notest, test=coverage * fix. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * add func. notest, test=coverage * fix ut. notest, test=coverage * fix. test=develop * fix. test=develop
-
- 08 11月, 2021 1 次提交
-
-
由 kuizhiqing 提交于
-
- 28 10月, 2021 3 次提交
-
-
由 wangguanqun 提交于
* add trainer desc config to distributed strategy * code style modified * data_feed set lod * fix bug * code style * fix bug * save load * save load * save unittest * add unittest of the_one_ps * unittest * add todo in communicator sendsparse
-
由 seemingwang 提交于
-
由 Bo Liu 提交于
-
- 27 10月, 2021 1 次提交
-
-
由 xiongkun 提交于
* bugfix: only check backend when mode == Collecive * fix bug
-
- 25 10月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
* fix bug of check_inf * fix allreduce
-
- 21 10月, 2021 2 次提交
-
-
由 danleifeng 提交于
-
由 xiongkun 提交于
-
- 20 10月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
* fix bugs of ClipGradByGlobalNorm * add unittests * add unittests
-
- 19 10月, 2021 2 次提交
-
-
由 danleifeng 提交于
-
由 WangXi 提交于
-
- 18 10月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
* [HybridParallel]Support fp16 in dygraph hybrid parallel * update * update * update for recompute * add unittest of pp+fp16 * add unittest of recompute+fp16 * update * modify ut
-
- 15 10月, 2021 1 次提交
-
-
由 duanboqiang 提交于
-
- 14 10月, 2021 3 次提交
-
-
由 duanboqiang 提交于
-
由 ShenLiang 提交于
* add no_sync for parameters sync * add pipeline for moe
-
由 Yuang Liu 提交于
-
- 13 10月, 2021 2 次提交
-
-
由 Guoxia Wang 提交于
-
由 Leo Chen 提交于
* refine amp level * fix typo * update tracer._amp_level
-
- 12 10月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
* fix calling bug of HybridParallelClipGrad * fix bugs of HybridParallelClipGrad * add unittest of pp with HybridParallelClipGrad * fix bugs in mp_layers.py * update * fix bugs in pp_layers.py * update
-
- 11 10月, 2021 1 次提交
-
-
由 danleifeng 提交于
* heterps:add fuse_allreduce op; test=develop * add program_mode in minimize for pslib mode;test=develop
-
- 09 10月, 2021 1 次提交
-
-
由 zhaoyingli 提交于
* support ClipGradByGlobalNorm in sharding * support ClipGradByGlobalNorm in sharding * test=allcase
-
- 08 10月, 2021 1 次提交
-
-
由 yaoxuefeng 提交于
-
- 07 10月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
* fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer * update * update
-
- 30 9月, 2021 1 次提交
-
-
由 李季 提交于
* fix raw optim * pre-commit test file Co-authored-by: Nsneaxiy <sneaxiy@126.com>
-
- 29 9月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 28 9月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 24 9月, 2021 2 次提交
-
-
由 ShenLiang 提交于
-
由 seemingwang 提交于
* graph engine demo * upload unsaved changes * fix dependency error * fix shard_num problem * py client * remove lock and graph-type * add load direct graph * add load direct graph * add load direct graph * batch random_sample * batch_sample_k * fix num_nodes size * batch brpc * batch brpc * add test * add test * add load_nodes; change add_node function * change sample return type to pair * resolve conflict * resolved conflict * resolved conflict * separate server and client * merge pair type * fix * resolved conflict * fixed segment fault; high-level VLOG for load edges and load nodes * random_sample return 0 * rm useless loop * test:load edge * fix ret -1 * test: rm sample * rm sample * random_sample return future * random_sample return int * test fake node * fixed here * memory leak * remove test code * fix return problem * add common_graph_table * random sample node &test & change data-structure from linkedList to vector * add common_graph_table * sample with srand * add node_types * optimize nodes sample * recover test * random sample * destruct weighted sampler * GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * pybind sample nodes api * pull nodes with step * fixed pull_graph_list bug; add test for pull_graph_list by step * add graph table;name * add graph table;name * add pybind * add pybind * add FeatureNode * add FeatureNode * add FeatureNode Serialize * add FeatureNode Serialize * get_feat_node * avoid local rpc * fix get_node_feat * fix get_node_feat * remove log * get_node_feat return py:bytes * merge develop with graph_engine * fix threadpool.h head * fix * fix typo * resolve conflict * fix conflict * recover lost content * fix pybind of FeatureNode * recover cmake * recover tools * resolve conflict * resolve linking problem * code style * change test_server port * fix code problems * remove shard_num config * remove redundent threads * optimize start server * remove logs * fix code problems by reviewers' suggestions * move graph files into a folder * code style change * remove graph operations from base table * optimize get_feat function of graph engine * fix long long count problem * remove redandunt graph files * remove unused shell * recover dropout_op_pass.h * fix potential stack overflow when request number is too large & node add & node clear & node remove * when sample k is larger than neigbor num, return directly * using random seed generator of paddle to speed up * fix bug of random sample k * fix code style * fix code style * add remove graph to fleet_py.cc * fix blocking_queue problem * fix style * fix * recover capacity check * add remove graph node; add set_feature * add remove graph node; add set_feature * add remove graph node; add set_feature * add remove graph node; add set_feature * fix distributed op combining problems * optimize * remove logs Co-authored-by: NHuang Zhengjie <270018958@qq.com> Co-authored-by: NWeiyue Su <weiyue.su@gmail.com> Co-authored-by: Nsuweiyue <suweiyue@baidu.com> Co-authored-by: Nluobin06 <luobin06@baidu.com> Co-authored-by: Nliweibin02 <liweibin02@baidu.com> Co-authored-by: Ntangwei12 <tangwei12@baidu.com>
-
- 18 9月, 2021 2 次提交
-
-
由 WangXi 提交于
-
由 Guoxia Wang 提交于
* fix bug
-
- 17 9月, 2021 3 次提交
-
-
由 zhangbo9674 提交于
* add pure fp16 major function in auto_cast & tracer * support master weight in dygraph for pure fp16 * check mix dtype of fp16&fp32 for check_finite_and_unscale op * change pure fp16 funtion name * refine some bug in auto_cast * refine auto_cast interface logic * add param _casted_by_pure_fp16 for class Layer * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator * refine pure_fp16_decorator as decorator * add unittest * add comment * add comment * support recompute * add comment for auto_cast and decorator * support to_static_state_dict for paddle.jit.save * unlimite models num and optimizers num * add lookup_table in black_list * fix momentum and layer state_dict * fix bug in layer state_dict * fix bug in layer state_dict_helper * refine unittest * refine test_momentun_op * refine interface and some code * refine amp_decorator interface * refine pure fp16 interface * refine master weight interface
-
由 Guoxia Wang 提交于
-
由 Guoxia Wang 提交于
* add launch doc
-
- 16 9月, 2021 2 次提交
- 15 9月, 2021 1 次提交
-
-
由 Haohongxiang 提交于
-