- 10 Nov, 2022 1 commit
-
-
Committed by wuhuachaocoding
* cherry-pick recompute doc update. * update.
-
- 01 Nov, 2022 1 commit
-
-
Committed by sneaxiy
-
- 29 Oct, 2022 1 commit
-
-
Committed by sneaxiy
* reformat hybrid_parallel_util.py by black * add fused_allreduce_gradients_with_group * add scale * fix ci
-
- 27 Sep, 2022 1 commit
-
-
Committed by LiYuRio
-
- 22 Sep, 2022 2 commits
-
-
Committed by Roc
Unify the logger manager in FleetAPI. Hide APIs under distributed/utils that users don't need.
-
Committed by Haohongxiang
* fix bugs of mp * fix bugs of mp * update * update * fix bug
-
- 19 Sep, 2022 1 commit
-
-
Committed by wuhuachaocoding
-
- 16 Aug, 2022 1 commit
-
-
Committed by Haohongxiang
* reconstruct_of_fleet_api * update
-
- 09 Aug, 2022 1 commit
-
-
Committed by zhaocaibei123
* save load * save load * add unittest * first commit * second commit * third commit * remove SaveLocalFS in memory sparse table * save dense param * update * push slot * fix push show clk: int -> float * add unittest * fix sample * unittest * add AsExtra for op * unittest * modify fs.py * modify fs.py * fix some bugs * add dataset hdfs config * local change * dataset use different hadoop ugi/fs_name * add * fix conflict * fix * remove logs * code style * fix * code style * code style * fix * code style * save_dense_param * fix * fix * fix * fix * change momentum in dense optimizer * fix * fix * change fluid => paddle.static * remove some useless code Co-authored-by: esythan <esythan@126.com>
-
- 27 Jun, 2022 1 commit
-
-
Committed by wanghuancoder
* rename eagerpylayer
-
- 16 Jun, 2022 1 commit
-
-
Committed by gongweibao
-
- 14 Jun, 2022 1 commit
-
-
Committed by Haohongxiang
-
- 07 Jun, 2022 1 commit
-
-
Committed by Haohongxiang
* fix bugs of reducer * update * update
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
* use yapf to format all python file * yapf exclude two unittests file for they rely on writing and reading file, and format will break them * disable diff_py_file because too many diff files cause command following failed
-
- 31 May, 2022 1 commit
-
-
Committed by Haohongxiang
-
- 23 May, 2022 1 commit
-
-
Committed by Weilong Wu
-
- 16 May, 2022 1 commit
-
-
Committed by ShenLiang
* fix recompute in mp * fix recompute
-
- 16 Apr, 2022 1 commit
-
-
Committed by Baibaifan
-
- 15 Apr, 2022 1 commit
-
-
Committed by danleifeng
* add gpupsutil and afsclient; test=develop
-
- 06 Apr, 2022 1 commit
-
-
Committed by Haohongxiang
* remove unrequired ut cases * update * fix bugs * update
-
- 04 Apr, 2022 1 commit
-
-
Committed by ShenLiang
-
- 25 Mar, 2022 1 commit
-
-
Committed by Jiabin Yang
* refactor eager flags * fix flags error when we switch from eager to dygraph * fix ci problem * fix ci * fix ci * merge develop and fix code style * merge develop and fix code style * fix op test error * fix op test error * fix op test error * fix op test error * fix op test error * merge develop
-
- 23 Mar, 2022 1 commit
-
-
Committed by zhaocaibei123
* fix benchmark and communicator config * fix bugs of the_one_ps * multi program and fix bug in optimizer * multi program in the_one_ps * public commcontext * ps optimizer multi programs * cvm & datanorm backend * fix dim * fix unittest * fix * the one ps merge * remove comm * add DownpourLiteWorker * all * fix * fix * device worker downpour lite * fix * fix bug in global shuffle * save inference model * fix & add log * fix * remove log * fix * fix save summary * fix * fix pscore * fix * fix * fix * fix * fix * remove logs * fix * fix * fix * fix * fix * add some comments * fix Co-authored-by: esythan <esythan@126.com>
-
- 02 Mar, 2022 1 commit
-
-
Committed by Leo Chen
-
- 18 Feb, 2022 1 commit
-
-
Committed by zhangbo9674
* support dtype param for auto_cast * add amp_dtype for tracer * add unsupported bf16 list * support bf16 amp for O2 * refine python interface for bfloat16 * refine code * refine code * refine unittest * refine code * refine code * add bf16 o1 * refine code by comment * add gradient accumulator * add recompute
-
- 21 Dec, 2021 2 commits
-
-
Committed by Guoxia Wang
-
Committed by Haohongxiang
* update * fix bugs * modify code style * fix bugs of _get_global_group
-
- 09 Dec, 2021 1 commit
-
-
Committed by Haohongxiang
* merge latest develop branch * fix bugs * update * fix bugs for unittest * modify for less use of gpu mem * fix bugs of using _reset_grad_inplace_version * update * update * modify for CI-Coverage * retrick all CIs
-
- 29 Nov, 2021 2 commits
-
-
Committed by Baibaifan
-
Committed by 李季
Co-authored-by: Chen Long <1300851984@qq.com>
-
- 25 Nov, 2021 1 commit
-
-
Committed by Baibaifan
-
- 25 Oct, 2021 1 commit
-
-
Committed by Haohongxiang
* fix bug of check_inf * fix allreduce
-
- 21 Oct, 2021 1 commit
-
-
Committed by danleifeng
-
- 18 Oct, 2021 1 commit
-
-
Committed by Haohongxiang
* [HybridParallel]Support fp16 in dygraph hybrid parallel * update * update * update for recompute * add unittest of pp+fp16 * add unittest of recompute+fp16 * update * modify ut
-
- 13 Oct, 2021 1 commit
-
-
Committed by Leo Chen
* refine amp level * fix typo * update tracer._amp_level
-
- 11 Oct, 2021 1 commit
-
-
Committed by danleifeng
* heterps:add fuse_allreduce op; test=develop * add program_mode in minimize for pslib mode;test=develop
-
- 08 Oct, 2021 1 commit
-
-
Committed by yaoxuefeng
-
- 24 Sep, 2021 1 commit
-
-
Committed by seemingwang
* graph engine demo * upload unsaved changes * fix dependency error * fix shard_num problem * py client * remove lock and graph-type * add load direct graph * add load direct graph * add load direct graph * batch random_sample * batch_sample_k * fix num_nodes size * batch brpc * batch brpc * add test * add test * add load_nodes; change add_node function * change sample return type to pair * resolve conflict * resolved conflict * resolved conflict * separate server and client * merge pair type * fix * resolved conflict * fixed segment fault; high-level VLOG for load edges and load nodes * random_sample return 0 * rm useless loop * test:load edge * fix ret -1 * test: rm sample * rm sample * random_sample return future * random_sample return int * test fake node * fixed here * memory leak * remove test code * fix return problem * add common_graph_table * random sample node &test & change data-structure from linkedList to vector * add common_graph_table * sample with srand * add node_types * optimize nodes sample * recover test * random sample * destruct weighted sampler * GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * pybind sample nodes api * pull nodes with step * fixed pull_graph_list bug; add test for pull_graph_list by step * add graph table;name * add graph table;name * add pybind * add pybind * add FeatureNode * add FeatureNode * add FeatureNode Serialize * add FeatureNode Serialize * get_feat_node * avoid local rpc * fix get_node_feat * fix get_node_feat * remove log * get_node_feat return py:bytes * merge develop with graph_engine * fix threadpool.h head * fix * fix typo * resolve conflict * fix conflict * recover lost content * fix pybind of FeatureNode * recover cmake * recover tools * resolve conflict * resolve linking problem * code style * change test_server port * fix code problems * remove shard_num config * remove redundent threads * optimize start server * remove logs * fix code problems by 
reviewers' suggestions * move graph files into a folder * code style change * remove graph operations from base table * optimize get_feat function of graph engine * fix long long count problem * remove redundant graph files * remove unused shell * recover dropout_op_pass.h * fix potential stack overflow when request number is too large & node add & node clear & node remove * when sample k is larger than neighbor num, return directly * using random seed generator of paddle to speed up * fix bug of random sample k * fix code style * fix code style * add remove graph to fleet_py.cc * fix blocking_queue problem * fix style * fix * recover capacity check * add remove graph node; add set_feature * add remove graph node; add set_feature * add remove graph node; add set_feature * add remove graph node; add set_feature * fix distributed op combining problems * optimize * remove logs Co-authored-by: Huang Zhengjie <270018958@qq.com> Co-authored-by: Weiyue Su <weiyue.su@gmail.com> Co-authored-by: suweiyue <suweiyue@baidu.com> Co-authored-by: luobin06 <luobin06@baidu.com> Co-authored-by: liweibin02 <liweibin02@baidu.com> Co-authored-by: tangwei12 <tangwei12@baidu.com>
-
- 17 Sep, 2021 1 commit
-
-
Committed by zhangbo9674
* add pure fp16 major function in auto_cast & tracer * support master weight in dygraph for pure fp16 * check mix dtype of fp16&fp32 for check_finite_and_unscale op * change pure fp16 function name * refine some bug in auto_cast * refine auto_cast interface logic * add param _casted_by_pure_fp16 for class Layer * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator * refine pure_fp16_decorator as decorator * add unittest * add comment * add comment * support recompute * add comment for auto_cast and decorator * support to_static_state_dict for paddle.jit.save * unlimit models num and optimizers num * add lookup_table in black_list * fix momentum and layer state_dict * fix bug in layer state_dict * fix bug in layer state_dict_helper * refine unittest * refine test_momentun_op * refine interface and some code * refine amp_decorator interface * refine pure fp16 interface * refine master weight interface
-
- 15 Sep, 2021 1 commit
-
-
Committed by Haohongxiang
-