1. 10 Jan, 2019 · 1 commit
    • [Feature] support mix precision training for resnet (#14899) · fd854183
      Committed by Wu Yi
      * clip softmax for fp16
      
      * updates
      
      * fuse xent support fp16 test=develop
      
      * wip
      
      * wip
      
      * add simple row reduce
      
      * wip fp16 accurate softmax
      
      * add accurate softmax kernel for fp16 test=develop
      
      * update test=develop
      
      * fix cpu build test=develop
      
      * update api.spec test=develop
      
      * follow comments test=develop
      
      * fix build test=develop
      
      * fix trt build test=develop
      
      * fix inference build test=develop
      
      * fix merge test=develop
      
      * update test=develop
      
      * try fix build test=develop
      
      * fix build test=develop
      
      * rename real_exp test=develop
      
      * fortest
      
      * remove hacky kernels test=develop
      
      * clean up test=develop
  2. 09 Jan, 2019 · 1 commit
  3. 27 Dec, 2018 · 5 commits
  4. 20 Dec, 2018 · 2 commits
  5. 14 Dec, 2018 · 2 commits
  6. 13 Dec, 2018 · 2 commits
  7. 10 Dec, 2018 · 1 commit
  8. 11 Nov, 2018 · 1 commit
  9. 08 Nov, 2018 · 3 commits
  10. 07 Nov, 2018 · 3 commits
  11. 29 Oct, 2018 · 1 commit
    • [1.1] [project] train imagenet using large batch size (#13766) · 26200f2e
      Committed by Wu Yi
      * fix nccl2 lars dist support
      
      * put lars in momentum op
      
      * add tests lars
      
      * fix ci
      
      * fix cpu kernel
      
      * soft warning
      
      * remove lars in test_recognize_digits.py
      
      * move to another op
      
      * add file
      
      * update api.spec test=develop
      
      * update test=develop
      
      * fix api.spec test=develop
      
      * wip
      
      * wip, finish grad merge ops
      
      * wip, finish graph build
      
      * wip test running
      
      * work on 1 gpu
      
      * workable version
      
      * update
      
      * fix tests
      
      * fuse broadcast op
      
      * fix compile failed
      
      * refine
      
      * add batch merge test mnist
      
      * fix CI test=develop
      
      * fix build
      
      * use independent bn params for batch merge test=develop
      
      * update api.spec
      
      * follow comments and for test
      
      * wip
      
      * refine tests test=develop
      
      * follow comments test=develop
      
      * remove startup bn modify test=develop
      
      * follow comments test=develop
      
      * fix merge test=develop
  12. 25 Oct, 2018 · 1 commit
  13. 18 Oct, 2018 · 1 commit
  14. 17 Oct, 2018 · 1 commit
  15. 15 Oct, 2018 · 1 commit
    • Add check for opt op (#13840) · 8e2fdc54
      Committed by chengduo
      * add check for opt op
      
      * fix opt op
      test=develop
      
      * fix test fail
      test=develop
      
      * fix optimization doc
      test=develop
      
      * test=develop
  16. 25 Sep, 2018 · 1 commit
  17. 20 Sep, 2018 · 1 commit
  18. 19 Sep, 2018 · 1 commit
  19. 18 Sep, 2018 · 1 commit
  20. 05 Sep, 2018 · 1 commit
    • Add centered mode rmsprop (#13161) · 6e03f790
      Committed by Qiao Longfei
      * rmsprop optimizer support v1 mode
      
      * typo
      
      * optimize code
      
      * refine code
      
      * optimize unit test
      
      * update test_rmsprop_op.py
      
      * update formula of rmsprop
      
      * optimize document
      
      * update API.spec for RMSPropOptimizer
      
      * add default value to check_output_with_place equal_nan
  21. 29 Aug, 2018 · 1 commit
  22. 28 Aug, 2018 · 1 commit
  23. 15 Aug, 2018 · 1 commit
  24. 26 Jul, 2018 · 2 commits
  25. 20 Jul, 2018 · 1 commit
  26. 18 Jul, 2018 · 1 commit
  27. 17 Jul, 2018 · 1 commit
    • Remove block api (#12107) · db67d60e
      Committed by Wu Yi
      * remove block api
      
      * remove clone_variable
      
      * hide block inner apis
      
      * update
      
      * fix tests
  28. 13 Jul, 2018 · 1 commit
    • Refine multi thread cpu parallel exe (#11406) · 86b0a725
      Committed by chengduo
      * refine multi-thread CPU Parallel exe
      
      * refine multi thread CPU Parallel exe
      
      * Refine CPU version for ParallelExecutor
      
      * add share_parameter_between_cards_
      
      * Fix ParallelExecutor bug
      
      * Fix unit test
      
      * Fix parameter opt balance
      
      * Fix with opti (param->grad)
      
      * Add grad to op var
      
      * Remove shard_param_between_cards