1. 12 Jun, 2019 1 commit
    • combine noavx and avx package (#17889) · 5c06bff2
      tensor-tang committed
      * support avx and noavx core
      
      * add catch and give some log
      
      test=develop
      
      * fix build
      
      test=develop
      
      * add missing package
      
      test=develop
      
      * fix pybind name
      
      test=develop
      
      * fix import error
      
      test=develop
      
      * combine noavx core
      
      test=develop
      
      * add requirements
      
      test=develop
      
      * fix unknown message
      
      test=develop
      
      * fix api spec
      
      test=develop
      
      * refine and clean
      
      test=develop
      
      * update
      
      * pass dist ut
      
      * follow comments
      
      test=develop
      
      * refine scripts
      
      test=develop
  2. 10 Jun, 2019 1 commit
  3. 06 Jun, 2019 4 commits
    • G
      fbbdc9cc
    • Make ParallelExecutor support Windows GPU (#17787) · 453a49b1
      wopeizl committed
      * fix the ParallelExecutor on Windows
      test=develop
      * restrict to using only one GPU on Windows
    • INT8 MKL-DNN v2 integrate to slim (#17634) · 993c703b
      翟飞跃 committed
      * refactor PR 16865
      
      * delete mergetool files
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * create dir for int8 model before calling SaveOptimModel
      
      * test=develop
      
      * mkldnn int8 only supports Linux; test=develop
      
      * refine code; test=develop
      
      * remove comment; test=develop
      
      * refine code; test=develop
      
      * fix bug; test=develop
      
      * add exception for mkldnn_post_training_strategy
      
      * reuse int8v2 CAPI dataset; test=develop
      
      * fix accuracy check bug; test=develop
      
      * remove tab
      
      * convert files to unix format
      
      * test=develop
      
      * reduce CI time;test=develop
      
      * reduce CI time and refine code;test=develop
      
      * refine comment; test=develop
      
      * add cmake FLAGS;test=develop
      
      * remove predict_num;test=develop
    • use pyreader to read data in dygraph mode (#17314) · 841553e1
      wopeizl committed
      * use pyreader to read data
      
      * add return_list to PyReader to support returning values as a list
  4. 05 Jun, 2019 1 commit
  5. 04 Jun, 2019 1 commit
  6. 31 May, 2019 1 commit
  7. 27 May, 2019 2 commits
  8. 25 May, 2019 1 commit
    • TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc
      Zhaolong Xing committed
      * fluid int8 train and trt int8 predict align.
      trt int8 predict init
      op converter
      
      * 2. align fluid int8 train and trt int8 inference.
      enhance quant dequant fuse pass
      enhance op converter, trt engine, trt engine op, trt subgraph pass.
      
      * 3. add delete_quant_dequant_pass for trt
      
      test=develop
      
      * 4. add the missing file
      test=develop
      
      * 5. modified the C++ interface but forgot to update the pybind code
      fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
      test=develop
  9. 24 May, 2019 4 commits
  10. 23 May, 2019 3 commits
  11. 20 May, 2019 1 commit
  12. 17 May, 2019 2 commits
    • polish parallel dygraph code (#17164) · 02175555
      Yan Xu committed
      * add var grad hook test=develop
    • Fix/Fix memory leak in dygraph (#17394) · d7df4e5e
      Jiabin Yang committed
      * test=develop, add gradient sort backward strategy
      
      * test=develop, fix test by adding FLAGS_cudnn_deterministic on new tests
      
      * test=develop, fix memory leak in dygraph mode
      
      * test=develop, fix memory leak in dygraph mode
      
      * test=develop, polish code
      
      * test=develop, polish code
      
      * test=develop, polish code
  13. 16 May, 2019 1 commit
  14. 15 May, 2019 1 commit
    • add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118) · 66d51206
      jiaqi committed
      * add save/load model, shrink table, cvm, config file & fix pull dense bug
      test=develop
      
      * fix global shuffle bug, fix pull dense bug, fix release memory bug, fix shrink error
      add client flush, add get data size
      test=develop
      
      * fix global shuffle bug
      test=develop
      
      * fix global shuffle bug
      test=develop
      
      * fix code style
      test=develop
      
      * fix code style & modify pslib cmake
      test=develop
      
      * fix error of _role_maker
      test=develop
      
      * fix code style
      test=develop
      
      * fix code style
      test=develop
      
      * fix code style
      test=develop
      
      * fix code style
      test=develop
      
      * fix code style
      test=develop
      
      * fix windows compile error of fleet
      test=develop
      
      * fix global shuffle bug
      
      * add comment
      test=develop
      
      * update pslib.cmake
      test=develop
      
      * fix fill sparse bug
      test=develop
      
      * fix push sparse bug
      test=develop
  15. 14 May, 2019 1 commit
  16. 13 May, 2019 1 commit
  17. 12 May, 2019 1 commit
  18. 10 May, 2019 1 commit
    • Double backward of conv2d. (#17211) · e32c9888
      qingqing01 committed
      * Add conv2d_grad_grad_op
      * Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
          - Now use it in conv2d_grad_grad.
          - Will simplify the searching code in conv2d and conv2d_grad in the next PR.
      * Enhance and fix bug in unit testing of gradient_checker.
      * Support fetching empty variables; return None in Python.
  19. 08 May, 2019 3 commits
  20. 07 May, 2019 1 commit
    • Cherry-pick benchmark related changes from release/1.4 (#17156) · a72dbe9a
      石晓伟 committed
      * cherry-pick commit from 88770542
      
      * cherry-pick commit from 3f0b97df
      
      * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn
      
      (cherry picked from commit 8643dbc2)
      
      * Cherry-Pick from 16662 : Anakin subgraph cpu support
      
      (cherry picked from commit 7ad182e1)
      
      * Cherry-pick from 1662, 16797.. : add anakin int8 support
      
      (cherry picked from commit e14ab180)
      
      * Cherry-pick from 16813 : change singleton to graph RegistBlock
      test=release/1.4
      
      (cherry picked from commit 4b9fa423)
      
      * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2
      
      Support ShuffleNet and MobileNet-v2, test=release/1.4
      
      (cherry picked from commit a6fb066f)
      
      * Cherry-pick : anakin subgraph add opt config layout argument #16846
      test=release/1.4
      
      (cherry picked from commit 8121b3ec)
      
      * 1. add shuffle_channel_detect
      
      (cherry picked from commit 6efdea89)
      
      * update shuffle_channel op convert, test=release/1.4
      
      (cherry picked from commit e4726a06)
      
      * Modify symbol export rules
      
      test=develop
  21. 06 May, 2019 1 commit
  22. 30 Apr, 2019 1 commit
  23. 25 Apr, 2019 1 commit
  24. 22 Apr, 2019 3 commits
  25. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Zeng Jinle committed
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  26. 18 Apr, 2019 1 commit