1. 28 Apr, 2019 2 commits
    • Refine dropout gpu memory (#17095) · 28d69d71
      Committed by Zeng Jinle
      * refine_dropout_mem,test=develop
      
      * # This is a combination of 14 commits.
      # The first commit's message is:
      remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066)
      
      # This is the 2nd commit message:
      
      Fleet unify distributed training (#16791)
      
      * implement distributed transpiler with fleet
      # This is the 3rd commit message:
      
      ParallelDyGraph with GPU collective mode (#16827)
      
      implement dygraph.parallel.DataParallel to hook reduce op.
      
      # This is the 4th commit message:
      
      Init mixed precision training interface (#16856)
      
      * Init mixed precision training interface
      
      * Add fp16 test script
      
      test=develop
      
      * All initializers support float16
      
      test=develop
      
      * Code cleanup & add more code annotations
      
      test=develop
      
      * Update API spec
      
      test=develop
      
      * Add usage example in doc
      
      test=develop
      
      # This is the 5th commit message:
      
      fix reference_count_pass,test=develop (#17060)
      
      test=develop
      # This is the 6th commit message:
      
      Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)
      
      * Cache the information of linear interpolation in forward and use it in backward.
      test=develop
      
      * Fix cuda kernel.
      test=develop
      
      # This is the 7th commit message:
      
      remove unnecessary prepare_data (#17080)
      
      test=develop
      # This is the 8th commit message:
      
      fix interpolate cu. test=develop (#17101)
      
      # This is the 9th commit message:
      
      test=develop, double backward leaky_relu (#17067)
      
      backward of backward: leaky_relu
      # This is the 10th commit message:
      
      fix fuse optimizer ops (#17102)
      
      test=develop
      # This is the 11th commit message:
      
      truncated_gaussian_random supported in distributed training, test=develop (#17091)
      
      # This is the 12th commit message:
      
       Detailed coordinate description for yolov3 loss (#17007)
      
      * Detailed coordinate description for yolov3 loss
      
      test=develop
      
      * modified api.spec
      
      test=develop
      
      * modified loss name
      
      * fix api.spec
      
      test=develop
      
      * polish description
      
      test=develop
      
      * modified api.spec
      
      test=develop
      
      # This is the 13th commit message:
      
      fix test_weight_decay (#17109)
      
      test=develop
      # This is the 14th commit message:
      
      Path flag (#17105)
      
      * fix python/paddle/fluid/__init__.py detecting problems
    • Use CudnnWorkspaceHandle in exhaustive search (#17082) · b9494058
      Committed by Huihuang Zheng
      1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
      2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search.
      
      test=develop
  2. 26 Apr, 2019 3 commits
  3. 25 Apr, 2019 2 commits
  4. 23 Apr, 2019 2 commits
    • Make conv cudnn workspace size configurable (#17036) · 0c335dcd
      Committed by Zeng Jinle
      * make_conv_cudnn_ws_size_configurable, test=develop
      
      * change std::max to std::min
      test=develop
    • Support backward of backward for Relu and add a new gradient checker by... · c1c2633a
      Committed by qingqing01
      Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. (#16862)
      
      * Support backward of backward and a new gradient checker
      * Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package.
      
      1. Add ReluDoubleGradMaker when register relu_grad.
      2. Add a new gradient checker by comparing theoretical and numerical Jacobian.  Check double gradients by double_grad_check.
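
The gradient checker described in this commit compares a theoretical (analytic) Jacobian against a numerical one. The following is a minimal pure-Python sketch of that general idea, not the code from the commit: the function names `relu_jacobian`, `numerical_jacobian`, and `check_grad`, the central-difference scheme, and the tolerance values are all assumptions made for illustration.

```python
def relu(x):
    """Elementwise ReLU on a list of floats."""
    return [max(v, 0.0) for v in x]


def relu_jacobian(x):
    """Analytic Jacobian of elementwise ReLU: diagonal, 1 where x > 0, else 0."""
    n = len(x)
    return [[1.0 if (i == j and x[j] > 0) else 0.0 for j in range(n)]
            for i in range(n)]


def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x with central differences (assumed scheme)."""
    n = len(x)
    m = len(f(x))
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two matrices."""
    return max(abs(u - v) for row_a, row_b in zip(a, b)
               for u, v in zip(row_a, row_b))


def check_grad(f, analytic_jac, x, eps=1e-6, atol=1e-4):
    """Return True when the analytic and numerical Jacobians agree within atol."""
    return max_abs_diff(analytic_jac(x), numerical_jacobian(f, x, eps)) <= atol


if __name__ == "__main__":
    # Inputs are kept away from 0, where ReLU is not differentiable.
    print(check_grad(relu, relu_jacobian, [-1.5, 0.3, 2.0]))
```

Checking a double gradient would follow the same pattern, with the first-order gradient function taking the place of `relu`; the sketch leaves that out for brevity.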
  5. 22 Apr, 2019 4 commits
  6. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Committed by Zeng Jinle
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  7. 20 Apr, 2019 1 commit
  8. 19 Apr, 2019 1 commit
  9. 18 Apr, 2019 1 commit
  10. 17 Apr, 2019 3 commits
  11. 16 Apr, 2019 20 commits