1. 29 Aug, 2019 1 commit
    • Distributed training cherry-pick for Release 1.5 (#19486) · 416922e2
      tangwei12 committed
      * fix bug in Class MultiSlotDataGenerator's function _gen_str, test=develop (#18222)
      * fix some bugs when merging sparse embedding parameters, test=develop (#18223)
      * fix communicator with pyreader (#18350)
      * delete AllocatorFacade destructor (#18606)
      * fix distribute transpiler GRPC error code 4, RPC Deadline (#18984)
      * merge pr #18441
  2. 19 Jun, 2019 1 commit
    • Release/1.5 cherry pick (#18139) · 598addf1
      tangwei12 committed
      * fix save/load in fleet (#17675)
      
      * fix save/load in Fleet
      * add UT framework of Fleet (#18058)
      
      * add paddle cloud role maker for customized usage, note this is only for industrial users that have cloud environment pre-configuration (#18121)
      
      add paddle cloud role maker for specific cloud usage. This PR simplifies the user's configuration in distributed training.
      
      * assign role_maker before use (#18137)
  3. 29 Oct, 2018 1 commit
    • [1.1] [project] train imagenet using large batch size (#13766) · 26200f2e
      Wu Yi committed
      * fix nccl2 lars dist support
      
      * put lars in momentum op
      
      * add tests lars
      
      * fix ci
      
      * fix cpu kernel
      
      * soft warning
      
      * remove lars in test_recognize_digits.py
      
      * move to another op
      
      * add file
      
      * update api.spec test=develop
      
      * update test=develop
      
      * fix api.spec test=develop
      
      * wip
      
      * wip, finish grad merge ops
      
      * wip, finish graph build
      
      * wip test running
      
      * work on 1 gpu
      
      * workable version
      
      * update
      
      * fix tests
      
      * fuse broadcast op
      
      * fix compile failed
      
      * refine
      
      * add batch merge test mnist
      
      * fix CI test=develop
      
      * fix build
      
      * use independent bn params for batch merge test=develop
      
      * update api.spec
      
      * follow comments and for test
      
      * wip
      
      * refine tests test=develop
      
      * follow comments test=develop
      
      * remove startup bn modify test=develop
      
      * follow comments test=develop
      
      * fix merge test=develop