1. 28 Apr, 2019 2 commits
    • Refine dropout gpu memory (#17095) · 28d69d71
      Committed by Zeng Jinle
      * refine_dropout_mem,test=develop
      
      * # This is a combination of 14 commits.
      # The first commit's message is:
      remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066)
      
      # This is the 2nd commit message:
      
      Fleet unify distributed training (#16791)
      
      * implement distributed transpiler with fleet
      # This is the 3rd commit message:
      
      ParallelDyGraph with GPU collective mode (#16827)
      
      implement dygraph.parallel.DataParallel to hook reduce op.
      
      # This is the 4th commit message:
      
      Init mixed precision training interface (#16856)
      
      * Init mixed precision training interface
      
      * Add fp16 test script
      
      test=develop
      
      * All initializers support float16
      
      test=develop
      
      * Code cleanup & add more code annotations
      
      test=develop
      
      * Update API spec
      
      test=develop
      
      * Add usage example in doc
      
      test=develop
      
      # This is the 5th commit message:
      
      fix reference_count_pass,test=develop (#17060)
      
      test=develop
      # This is the 6th commit message:
      
      Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)
      
      * Cache the information of linear interpolation in forward and use it in backward.
      test=develop
      
      * Fix cuda kernel.
      test=develop
      
      # This is the 7th commit message:
      
      remove unnecessary prepare_data (#17080)
      
      test=develop
      # This is the 8th commit message:
      
      fix interpolate cu. test=develop (#17101)
      
      # This is the 9th commit message:
      
      test=develop, double backward leaky_relu (#17067)
      
      backward of backward: leaky_relu
      # This is the 10th commit message:
      
      fix fuse optimizer ops (#17102)
      
      test=develop
      # This is the 11th commit message:
      
      truncated_gaussian_random supported in distributed training, test=develop (#17091)
      
      # This is the 12th commit message:
      
       Detailed coordinate description for yolov3 loss (#17007)
      
      * Detailed coordinate description for yolov3 loss
      
      test=develop
      
      * modified api.spec
      
      test=develop
      
      * modified loss name
      
      * fix api.spec
      
      test=develop
      
      * polish description
      
      test=develop
      
      * modified api.spec
      
      test=develop
      
      # This is the 13th commit message:
      
      fix test_weight_decay (#17109)
      
      test=develop
      # This is the 14th commit message:
      
      Path flag (#17105)
      
      * fix python/paddle/fluid/__init__.py detecting problems
    • Use CudnnWorkspaceHandle in exhaustive search (#17082) · b9494058
      Committed by Huihuang Zheng
      1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
      2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search.
      
      test=develop
  2. 26 Apr, 2019 5 commits
  3. 25 Apr, 2019 4 commits
  4. 24 Apr, 2019 1 commit
  5. 23 Apr, 2019 5 commits
  6. 22 Apr, 2019 8 commits
  7. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Committed by Zeng Jinle
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  8. 20 Apr, 2019 1 commit
  9. 19 Apr, 2019 2 commits
  10. 18 Apr, 2019 2 commits
  11. 17 Apr, 2019 7 commits
  12. 16 Apr, 2019 2 commits