1. 05 Feb, 2020 2 commits
  2. 04 Feb, 2020 1 commit
  3. 20 Jan, 2020 1 commit
  4. 06 Jan, 2020 1 commit
  5. 31 Dec, 2019 1 commit
  6. 05 Dec, 2019 1 commit
  7. 28 Nov, 2019 1 commit
  8. 26 Nov, 2019 1 commit
  9. 25 Nov, 2019 1 commit
  10. 21 Nov, 2019 3 commits
    • fix fs_client_param bug (#21212) · 319d2ba9
      xujiaqi01 committed
      * fix fs_client_param bug; users can set this config through fleet_desc_file or the fleet config
      * test=develop
    • solve pslib core in stop worker (#21263) · 0d17c1b8
      Thunderbrook committed
      * general table
      
      * add sparse table
      test=develop
      
      * no cvm
      test=develop
      
      * add no_cvm
      test=develop
      
      * add note
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * add key of optimizer
      test=develop
      
      * solve pslib stop core
      test=develop
      
      * barrier
      test=develop
      
      * add notes
      test=develop
    • fix fleet util bug (#21254) · eca66f31
      xujiaqi01 committed
      * fix fleet util bug in save paddle inference model
      * test=develop
  11. 20 Nov, 2019 2 commits
  12. 15 Nov, 2019 2 commits
  13. 12 Nov, 2019 1 commit
  14. 04 Nov, 2019 1 commit
  15. 31 Oct, 2019 3 commits
  16. 25 Oct, 2019 1 commit
    • fix several sparse table issues (#20686) · 48669aa8
      xujiaqi01 committed
      * programs no longer need to define the embedding layers of every slot (previously none could be omitted); make trainer_param a repeated field in ps.proto.
      * add find_distributed_lookup_table_grads instead of hard-coding GRAD
      * support stop gradient on embeddings; push sparse raised an error before this fix.
      * fix fill sparse: skip slots that have no embedding. Before this fix, each slot's embedding in a sparse table had to be used in all training programs.
      * fix pull sparse: skip slots that have no embedding.
      * fix collect feasign label info: skip slots that have no embedding.
      * support multiple sparse tables across one or more training programs: each program can pull/push only its own related sparse tables instead of all sparse tables.
      * test=develop
  17. 18 Oct, 2019 1 commit
  18. 15 Oct, 2019 3 commits
  19. 14 Oct, 2019 1 commit
  20. 12 Oct, 2019 1 commit
  21. 11 Oct, 2019 1 commit
  22. 07 Oct, 2019 1 commit
  23. 30 Sep, 2019 1 commit
  24. 24 Sep, 2019 1 commit
    • support change shuffle and train thread num (#19841) · cedc0477
      xujiaqi01 committed
      * support change shuffle thread num
      * support change train thread num
      * fix receive shuffle data of each channel
      * data norm stop gradient
      * add check thread_tensor type and root_tensor type when merge metric
      * remove sleep in shuffle, add config
      * add config of pslib client to client communication
      * fix xbox str
      * add data norm op testcase
      * add flush in trainer finalize
  25. 23 Sep, 2019 2 commits
    • Forward recompute3 (#19913) · 9901f696
      mapingshuo committed
      * add recompute based checkpoints methods for large batch training
      test=develop
      
      * add append_backward_with_forward_recomputation
      test=develop
      
      * refine optimizer
      test=develop
      
      * update backward and optimizer
      test=develop
      
      * make Variable usable
      test=develop
      
      * add recompute code
      
      * refine optimizer
      test=develop
      
      * refine addup _append_backward_ops_with_checkpoints_
      1) for recompute part, just cache the grad_op_desc without appending to block
      2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch
      test=develop
      
      * make method private
      
      * add recompute strategy into DistributedStrategy
      test=develop
      
      * checkpoint version3
      test=develop
      
      * remove some print information
      test=develop
      
      * remove unused sumop
      test=develop
      
      * try to fix recompute with graph building modules
      
      * add input names to vars should be held
      
      * add memory debug tool
      
      * backup backward
      
      * Fix bugs
      
      * add backward desc for op not in any segments
      
      * add exception info for sub_block
      
      test=develop
      
      * modify code style
      
      test=develop
      
      * modify code style
      
      test=develop
      
      * remove print functions
      
      test=develop
      
      * add API spec
      
      test=develop
      test=document_preview
      
      * make Recompute a child class of Optimizer
      
      test=develop
      test=document_preview
      
      * add API spec
      
      test=develop
      test=document_preview
      
      * modify API spec
      
      test=develop
      test=document_preview
      
      * add document for Recompute
      
      test=develop
      test=document_preview
      
      * change API doc of Recompute
      
      test=develop
      test=document_preview
      
      * code cleaning
      
      test=develop
      test=document_preview
      
      * modify API spec
      
      * fix bugs when segments hold no element
      
      * add testcase for Recompute Optimizer
      
      test=develop
      test=document_preview
      
      * add test for apply_gradient, and code cleaning
      
      test=develop
      test=document_preview
      
      * add test case for load function
      
      * enable CI
      
      test=develop
      test=document
      
      * add test case
      
      test=develop
      test=document_preview
      
      * add sample code for 4 functions of recompute optimizer
      
      test=develop
      test=document_preview
    • paddle cloud role maker fix (#19646) · 278dd003
      tangwei12 committed
      * optimize cloud rolemaker, test=develop
  26. 19 Sep, 2019 1 commit
  27. 17 Sep, 2019 1 commit
  28. 10 Sep, 2019 1 commit
  29. 06 Sep, 2019 1 commit
  30. 05 Sep, 2019 1 commit