- 16 Feb, 2020 1 commit
-
-
Committed by tangwei12
pretty print for communicator flag
-
- 12 Feb, 2020 1 commit
-
-
Committed by tangwei12
* add thread barrier for the compiled program
-
- 11 Feb, 2020 1 commit
-
-
Committed by yaoxuefeng
* update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * update style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop
-
- 05 Feb, 2020 1 commit
-
-
Committed by xujiaqi01
* add hdfs ls retry time and sleep time, fix save inference * test=develop
-
- 03 Feb, 2020 1 commit
-
-
Committed by tangwei12
* fix bug with half communicator
-
- 02 Feb, 2020 1 commit
-
-
Committed by xujiaqi01
* add GeneralRoleMaker which is for general usage * test=develop
-
- 17 Jan, 2020 1 commit
-
-
Committed by tangwei12
* add half_async in the communicator * fix DistributedStrategy
-
- 06 Jan, 2020 1 commit
-
-
Committed by 123malin
* add distributed_strategy
-
- 31 Dec, 2019 1 commit
-
-
Committed by WangXi
-
- 05 Dec, 2019 1 commit
-
-
Committed by lilong12
-
- 28 Nov, 2019 1 commit
-
-
Committed by xujiaqi01
* fix fleet save bug of save_inference_model * test=develop
-
- 26 Nov, 2019 1 commit
-
-
Committed by Zhen Wang
* fix some typos in AMP. test=develop * delete useless codes. test=develop
-
- 25 Nov, 2019 1 commit
-
-
Committed by Thunderbrook
* print table stat test=develop * notes test=develop * notes test=develop
-
- 21 Nov, 2019 3 commits
-
-
Committed by xujiaqi01
* fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config * test=develop
-
Committed by Thunderbrook
* general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop * solve pslib stop core test=develop * barrier test=develop * add notes test=develop
-
Committed by xujiaqi01
* fix fleet util bug in save paddle inference model * test=develop
-
- 20 Nov, 2019 2 commits
-
-
Committed by Thunderbrook
* general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop
-
Committed by Dong Daxiang
test=develop
-
- 15 Nov, 2019 2 commits
-
-
Committed by xujiaqi01
* fix cache table bug * add save_paddle_inference_model * fix hdfs util bug * test=develop
-
Committed by xujiaqi01
* copy some feasigns and corresponding embeddings from one sparse table to another * copy all feasigns and corresponding embeddings from one sparse table to another * copy all dense params from one table to another * copy some local vars to other local vars
-
- 12 Nov, 2019 1 commit
-
-
Committed by lilong12
modify the implementation of save_persistables and save_inference_model for fleet collective mode (#20802) * modify the implementation of save_persistables and save_inference_model functions for fleet collective, test=develop * add ut, test=develop
-
- 04 Nov, 2019 1 commit
-
-
Committed by Thunderbrook
test=develop
-
- 31 Oct, 2019 3 commits
-
-
Committed by Chengmo
* fix PaddleCloud Role maker & add warning in distribute transpiler & change rpc_retry_times
-
Committed by Bai Yifan
-
Committed by Thunderbrook
* support dump param to afs test=develop * code style test=develop * code style test=develop * dump param test=develop * dump param test=develop * dump param test=develop * dump param test=develop
-
- 25 Oct, 2019 1 commit
-
-
Committed by xujiaqi01
* no longer need to define all embedding layers (no one less) of all slots in each program. make trainer_param repeated in ps.proto. * add find_distributed_lookup_table_grads instead of hard code GRAD * support embedding stop gradient. push sparse has error before fix this. * fix fill sparse, skip slots which do not have embedding. each slot's embedding in a sparse table should be used in all training programs before fix this. * fix pull sparse, skip slots which do not have embedding. * fix collect feasign label info, skip slots which do not have embedding. * support when there are multi sparse tables in one or multi training programs, each program can pull/push its own related sparse tables instead of all sparse tables. * test=develop
-
- 18 Oct, 2019 1 commit
-
-
Committed by xujiaqi01
* add check nan / inf in downpour worker during training * test=develop
-
- 15 Oct, 2019 3 commits
-
-
Committed by Chengmo
* test=develop,Fix communicator slow bug * test=develop, delete if() in stop_worker() * test=develop * fix UT, test=develop * fix bug in fetch handler, test=develop * fix bug in fetch handler, test=develop * test=develop, fix fetch barrier bug * test=develop, bug fix * test=develop, bug fix * test=develop, fix bug
-
Committed by WangXi
-
Committed by mapingshuo
* special case: strategy is None
-
- 14 Oct, 2019 1 commit
-
-
Committed by Thunderbrook
* support dump multi file test=develop * dump fix num file test=develop
-
- 12 Oct, 2019 1 commit
-
-
Committed by zhang wenhui
-
- 11 Oct, 2019 1 commit
-
-
Committed by zhang wenhui
* fix fc sort . test=develop
-
- 07 Oct, 2019 1 commit
-
-
Committed by zhang wenhui
-
- 30 Sep, 2019 1 commit
-
-
Committed by Chengmo
* refactor geo sgd & communicator
-
- 24 Sep, 2019 1 commit
-
-
Committed by xujiaqi01
* support change shuffle thread num * support change train thread num * fix receive shuffle data of each channel * data norm stop gradient * add check thread_tensor type and root_tensor type when merge metric * remove sleep in shuffle, add config * add config of pslib client to client communication * fix xbox str * add data norm op testcase * add flush in trainer finalize
-
- 23 Sep, 2019 2 commits
-
-
Committed by mapingshuo
* add recompute based checkpoints methods for large batch training test=develop * add append_backward_with_forward_recomputation test=develop * refine optimizer test=develop * update backward and optimizer test=develop * make Variable usable test=develop * add recompute code * refine optimizer test=develop * refine addup _append_backward_ops_with_checkpoints_ 1) for recompute part, just cache the grad_op_desc without appending to block 2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch test=develop * make method private * add recompute strategy into DistributedStrategy test=develop * checkpoint version3 test=develop * remove some print information test=develop * remove unused sumop test=develop * try to fix recompute with graph building modules * add input names to vars should be held * add memory debug tool * backup backward * Fix bugs * add backward desc for op not in any segments * add exception info for sub_block test=develop * modify code style test=develop * modify code style test=develop * remove print functions test=develop * add API spec test=develop test=document_preview * make Recompute a child class of Optimizer test=develop test=document_preview * add API spec test=develop test=document_preview * modify API spec test=develop test=document_preview * add document for Recompute test=develop test=document_preview * change API doc of Recompute test=develop test=document_preview * code cleaning test=develop test=document_preview * modify API spec * fix bugs when segments hold no element * add testcase for Recompute Optimizer test=develop test=document_preview * add test for apply_gradient, and code cleaning test=develop test=document_preview * add test case for load function * enable CI test=develop test=document * add test case test=develop test=document_preview * add sample code for 4 functions of recompute optimizer test=develop test=document_preview
-
Committed by tangwei12
* optimize cloud rolemaker, test=develop
-
- 19 Sep, 2019 1 commit
-
-
Committed by gongweibao
change _origin_program test=develop
-
- 17 Sep, 2019 1 commit
-
-
Committed by xujiaqi01
* support preload thread * sleep before fleet wrapper exit for pslib core dump * optimize hdfs log * fix master+patch bug
-