1. 09 Aug, 2019 2 commits
  2. 08 Aug, 2019 2 commits
    • fix QueueDataset queue size (#19016) · fc038da7
      jiaqi committed
      * fix QueueDataset queue size: set queue size = batch size * 100 to avoid accumulating too many instances in the channel when training is much slower than reading data.
      fc038da7
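The sizing rule in this commit (queue size = batch size * 100) can be illustrated with a bounded queue. This is a minimal sketch, not the QueueDataset implementation: the helper names `channel_capacity` and `make_channel` and the use of Python's `queue.Queue` are assumptions made for illustration.

```python
import queue


def channel_capacity(batch_size, factor=100):
    """Capacity rule from the commit message: batch size * 100."""
    return batch_size * factor


def make_channel(batch_size):
    # A bounded queue refuses (or blocks on) new items once it is full,
    # so a fast reader cannot pile up unlimited instances ahead of a
    # slow trainer.
    return queue.Queue(maxsize=channel_capacity(batch_size))


channel = make_channel(batch_size=2)
for i in range(channel.maxsize):
    channel.put_nowait(i)

try:
    channel.put_nowait("one too many")
    overflowed = False
except queue.Full:
    # The reader would have to wait here until the trainer consumes data.
    overflowed = True
```

With a blocking `put` instead of `put_nowait`, the reader thread would simply pause until the trainer drains the channel, which bounds memory use.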
    • Fix memory overwriting of tensors returned by executor (#19030) · 8f537354
      Leo Chen committed
      * fix memory overlapping of fetch var (return of executor.run), test=develop
      
      * fix wrong usage of ParallelExecutor in op_test, test=develop
      
      * remove useless parameter and simplify code
      
      * avoid untimely tensor destruction, test=develop
      
      * add testcase independent of OpTest, test=develop
      8f537354
  3. 06 Aug, 2019 1 commit
  4. 02 Aug, 2019 4 commits
    • Open gc by default (#18836) · 7ac748ad
      Zeng Jinle committed
      * open gc by default, test=develop
      
      * fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop
      
      * fix conditional_block op eager deletion bug, test=develop
      
      * add some comments to reviewers, test=develop
      7ac748ad
    • support filelist size < trainer num && fix pull dense (#18956) · 02c370c3
      jiaqi committed
      * support filelist size < trainer num
      * pull dense when stopping, to make sure local dense params are the same as on the pserver, so that saving the paddle model saves the same dense model as the pserver
      * enable QueueDataset to train on the same filelist several times
      02c370c3
    • Disable fuse optimization option (#18924) · e7da0940
      chengduo committed
      * Disable fuse optimization
      test=develop
      e7da0940
    • Fusion: seqpool_cvm_concat (#18471) · ee2f296e
      石晓伟 committed
      * add fusion_seqpool_cvm_concat test=develop
      
      * simplify pass, test=develop
      
      * fix code style, test=develop
      ee2f296e
  5. 01 Aug, 2019 1 commit
  6. 30 Jul, 2019 1 commit
  7. 29 Jul, 2019 2 commits
    • Remove legacy C++ memory optimization codes (#18834) · 8008ab4e
      Zeng Jinle committed
      * remove legacy memory optimization codes, test=develop
      
      * follow huihuang's comments, test=develop
      
      * follow luotao's comments, test=develop
      8008ab4e
    • add clear_model interface in fleetwrapper (#18815) · 52c1431e
      Thunderbrook committed
      * dump slot
      
      * test
      
      * proto
      
      * dump slot
      
      * test
      
      * proto
      
      * code style
      
      * code style
      
      * code style
      
      * style
      
      * add delete after unseen days
      
      * add unseen days
      
      * code style
      
      * conflict solve
      test=develop
      
      * add clear model
      
      * code style
      test=develop
      
      * code style
      test=develop
      52c1431e
  8. 27 Jul, 2019 1 commit
  9. 26 Jul, 2019 1 commit
    • Feature/mem opt pass refactor (#18735) · a802da65
      Zeng Jinle committed
      * first version memory optimize pass, test=develop
      
      * remove move_tensor_sharing_pass, test=develop
      
      * refine code comments, add unittests, test=develop
      
      * turn off memory_optimize by default, test=develop
      
      * follow huihuang's comments, test=develop
      
      * follow chengduoZH's comments, test=develop
      
      * fix grammar error, add const qualifier, fix pass_test exception message, test=develop
      
      * follow chengduoZH's comments 2nd, test=develop
      a802da65
  10. 25 Jul, 2019 1 commit
  11. 24 Jul, 2019 2 commits
    • Update trt5 for paddle-trt (#18645) · 26ae6d49
      Zhaolong Xing committed
      * update paddle-trt for:
          1. fix bug: core dump in split plugin when batch > 2.
          2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
          3. add new attr to dropout.
          4. shuffle channel, swish, relu6 support
          test=develop
      
      * 1. fix ci
      test=develop
      26ae6d49
    • add slot to sparse table (#18686) · d8396281
      Thunderbrook committed
      The change includes 2 things:
      
      1. Previously, saving the delta model and shrinking the table were controlled by the same parameter; delete_after_unseen_days is now added to control table shrinking.
      2. Previously, values in the sparse table had no slot; a slot is now added to the sparse table, along with DownpureCtrAccessor to support the new meta.
      test=develop
      d8396281
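The two changes above (a slot stored with each sparse value, and a dedicated `delete_after_unseen_days` threshold for shrinking) can be sketched with a plain dictionary. This is only an illustration under stated assumptions: the table layout, the `unseen_days` field, the `shrink_table` helper, and the choice to drop entries whose unseen days strictly exceed the threshold are all hypothetical and are not taken from pslib.

```python
def shrink_table(table, delete_after_unseen_days):
    """Return a copy of `table` without entries that went unseen too long.

    `table` maps a feature id to a value dict. Each value carries a
    `slot` (the new meta from this commit) and an `unseen_days` counter
    (a hypothetical field name).
    """
    return {
        feature_id: value
        for feature_id, value in table.items()
        # Assumed rule: keep entries seen within the threshold.
        if value["unseen_days"] <= delete_after_unseen_days
    }


sparse_table = {
    101: {"slot": 1, "unseen_days": 0, "embedding": [0.1, 0.2]},
    102: {"slot": 1, "unseen_days": 45, "embedding": [0.3, 0.4]},
    203: {"slot": 2, "unseen_days": 30, "embedding": [0.5, 0.6]},
}

shrunk = shrink_table(sparse_table, delete_after_unseen_days=30)
```

Because shrinking takes its own parameter here, it can be tuned independently of whatever setting governs saving the delta model.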
  12. 23 Jul, 2019 2 commits
    • support patch data, add load_one_table, fix bug (#18509) · d18aabb4
      jiaqi committed
      (1) support patch data (merge slots of instances with the same line id, modify dense layers that change size)
      (2) add fleet load_one_table interface, which supports loading from a paddle model and from a pslib model
      (3) fix push sparse bug which caused push sparse to take more time (about 10% in my testcase)
      (4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error.
      (5) add more debug info in TrainFilesWithProfiler
      d18aabb4
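Items (1) and (4) above describe merging the slots of instances that share a line id, and skipping slots a network does not use. A minimal sketch of that idea follows. The instance layout (a `(line_id, slots)` pair), the `merge_by_line_id` helper, and the `used_slots` parameter are hypothetical names for illustration and do not reflect the actual data feed code.

```python
def merge_by_line_id(instances, used_slots=None):
    """Merge slot values of instances that share the same line id.

    instances:  iterable of (line_id, {slot_name: [feature signs]}).
    used_slots: optional set of slot names the network uses; any other
                slot is skipped silently instead of raising an error.
    """
    merged = {}
    for line_id, slots in instances:
        target = merged.setdefault(line_id, {})
        for slot, values in slots.items():
            if used_slots is not None and slot not in used_slots:
                continue  # slot not in this network: skip it
            target.setdefault(slot, []).extend(values)
    return merged


instances = [
    ("a", {"s1": [1]}),
    ("b", {"s1": [2]}),
    ("a", {"s2": [3], "s1": [4]}),
]

merged_all = merge_by_line_id(instances)
merged_s1 = merge_by_line_id(instances, used_slots={"s1"})
```

Here the two instances with line id `"a"` collapse into one record, and the second call drops slot `s2` without failing.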
    • Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
      chengduo committed
      * support sparse gradients
      test=develop
      fd3aad6c
  13. 19 Jul, 2019 1 commit
    • Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8
      Huihuang Zheng committed
      Test PaddingRNN on V100 GPU device.
      
      Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.
                         
      GPU memory (MiB):   6414 (this PR)     vs   6837 (without this PR)
      Speed (steps/s):         10.28 (this PR)    vs    9.89 (without this PR)
       
      89bc3fd8
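The figures reported above can be turned into relative changes with a short calculation. The helper name `relative_change` is introduced here for illustration; the four input numbers are the ones quoted in the commit message.

```python
def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


# Numbers quoted in the commit message (with this PR vs without it).
memory_change = relative_change(6414, 6837)   # GPU memory in MiB
speed_change = relative_change(10.28, 9.89)   # training steps per second

print(f"GPU memory: {memory_change:+.1f}%")
print(f"Speed:      {speed_change:+.1f}%")
```

In other words, the PR saves 423 MiB (about 6.2% less GPU memory) and runs about 3.9% more steps per second in this test configuration.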
  14. 18 Jul, 2019 1 commit
    • Feature/auto_growth_allocator (#18561) · ae58afc5
      Zeng Jinle committed
      * feature/auto_growth_allocator, test=develop
      
      * add unittest of AlignedAllocator, test=develop
      
      * try to turn on auto_growth to test on CI, test=develop
      
      * fix segmentation fault in mixed_vector.h, test=develop
      
      * add unittests, test=develop
      ae58afc5
  15. 17 Jul, 2019 1 commit
  16. 16 Jul, 2019 1 commit
  17. 12 Jul, 2019 2 commits
  18. 11 Jul, 2019 2 commits
    • Feature/buffer_shared_inplace (#17911) · d3003a16
      Zeng Jinle committed
      * feature/buffer_shared_inplace, test=develop
      
      * refine code, test=develop
      
      * fix elementwise_add op cpu inplace and sum inplace bug, test=develop
      
      * add unittest and debug log, test=develop
      
      * fix parallel_executor scope bug, polish code, test=develop
      
      * fix sum op, activation op, single_in_place_inference bug, test=develop
      
      * remove kLocalExecScopeName, test=develop
      
      * fix unittest, test=develop
      
      * fix out_var first version bug, test=develop
      
      * follow comments, test=develop
      d3003a16
  19. 10 Jul, 2019 1 commit
  20. 09 Jul, 2019 1 commit
  21. 08 Jul, 2019 3 commits
  22. 04 Jul, 2019 1 commit
  23. 03 Jul, 2019 2 commits
  24. 02 Jul, 2019 1 commit
    • supports collective training with programs (#18392) · a873fa84
      Yi Liu committed
      1. Since the allreduce op has 4 reduce types, we split these four reduce types into four ops
      2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and removed the device-specific DeviceContext template parameter since the target DeviceContext is already known
      3. We removed the newly added Collective op role to reduce the complexity of program and graph analysis
      a873fa84
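Item 1 above, one op per reduce type instead of a single op with a reduce-type attribute, can be sketched as a single-process simulation. The commit message does not name the four reduce types; sum, prod, max, and min are assumed here, and the function names are hypothetical rather than the actual op names.

```python
from functools import reduce
import operator


def _make_allreduce(combine):
    """Build a simulated allreduce that merges values with `combine`."""
    def allreduce(rank_values):
        # rank_values holds one equally sized list per rank.
        # Reduce element-wise across ranks, then give every rank a copy.
        result = [reduce(combine, column) for column in zip(*rank_values)]
        return [list(result) for _ in rank_values]
    return allreduce


# One dedicated op per reduce type (assumed set of four types).
allreduce_sum = _make_allreduce(operator.add)
allreduce_prod = _make_allreduce(operator.mul)
allreduce_max = _make_allreduce(max)
allreduce_min = _make_allreduce(min)

ranks = [[1, 2], [3, 4]]  # two ranks, two elements each
summed = allreduce_sum(ranks)
```

Splitting by type means each op has a fixed behavior, so a program or graph pass can tell what an op does from its name alone without inspecting an attribute.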
  25. 01 Jul, 2019 1 commit
    • Fix Pooling output scale (#18186) · 7023a86c
      Michał Gallus committed
      * Int8: Fix Pooling output scale
      
      test=develop
      
      * Update scales quantization for certain operators
      
      These include: concat, transpose, pool and reshape. test=develop
      
      * Move concat minimum scale finding to quantizer
      
      test=develop
      7023a86c
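The "concat minimum scale finding" mentioned above can be illustrated as follows. This sketch assumes the symmetric int8 convention scale = 127 / max|x|, which the commit message does not state; under that assumption, the smallest input scale belongs to the input with the widest range, so using it for the concat output keeps every input within int8 bounds. All function names are hypothetical.

```python
def int8_scale(values, qmax=127.0):
    """Assumed convention: scale maps the largest magnitude to qmax."""
    max_abs = max(abs(v) for v in values)
    return qmax / max_abs


def concat_scale(input_scales):
    """A shared output scale for concat: the minimum of the input scales."""
    return min(input_scales)


def quantize(values, scale, qmax=127):
    """Scale, round, and clamp values into the signed int8 range."""
    return [max(-qmax, min(qmax, round(v * scale))) for v in values]


a = [0.5, -1.0]   # narrow range, large scale
b = [4.0, 1.0]    # wide range, small scale
shared = concat_scale([int8_scale(a), int8_scale(b)])
quantized = quantize(a + b, shared)
```

Picking the larger scale instead would map 4.0 far beyond 127 and rely on clamping, which is the kind of saturation a shared minimum scale avoids.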
  26. 29 Jun, 2019 1 commit
    • fix data feed ptr error (#18419) · 93a2b317
      jiaqi committed
      fix data feed ptr runtime error: the pipeline trainer could core dump in some cases, so the pointer now defaults to nullptr.
      93a2b317
  27. 27 Jun, 2019 1 commit