1. 25 Aug, 2022 1 commit
    • H
      optimize conv algo cache (#41891) · 1cd7e68b
      hong committed
      * optimizer conv alog speed
      
      * code polish
      
      * remove useless code
      
      * fix compile error
      
      * fix cpu compile error
      
      * not use cudnn alog t
      
      * add search cache max number
      
      * polish code
      
      * fix cache test bug
      
      * add groups data format to conv args
      
      * fix cache test bug
      
      * fix cudnn_deterministic bug
      
      * fix test switch auto tune bug
      
      * fix test swith autotune bug;
      
      * fix conv cache bug
      
      * fix cache test error
      
      * fix cache test bug
      
      * fix windows mac compile error
      
      * fix workspace search error
      
      * update cudnn cache
      
      * fix cache test bug; test=develop
      
      * fix autotune swith test error
      
      * polish code
      
      * oplish code
      1cd7e68b
  2. 01 Jul, 2022 1 commit
    • L
      Addition of switch_auto_tune option for transpose op (#43310) · 53d5abe3
      limingshu committed
      * 2nd part of transpose update
      
      * add switch_auto_tune option.
      
      * add some changes according to Ci
      
      * refine the structure of auto_tune_base.
      
      * merge develop changes
      
      * reset the switch_set_range and change unittest of transpose auto-tune
      
      * change the kernel auto-tune logits
      53d5abe3
  3. 05 Jun, 2022 1 commit
  4. 15 Apr, 2022 1 commit
    • L
      Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      limingshu committed
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
      35acfeda
  5. 05 Apr, 2022 1 commit
    • Z
      Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e
      Zhang Ting committed
      * switch autotune
      
      * implement AutoTuneCache
      
      * implement AutoTuneCache class
      
      * add pybind api
      
      * add dygraph test
      
      * support static mode and eager mode and improve unittests
      
      * rename the SwitchAutoTune Class and improve tests
      
      * improve AutoTuneStatus and reduce the cost of tests
      b0f8000e
  6. 03 Mar, 2022 1 commit
  7. 25 Feb, 2022 1 commit
    • 0
      move eye, size, erfinv, pixel_shuffle OP to phi (#39712) · 639675de
      0x45f committed
      * move eye OP to pten
      
      * move size OP to pten
      
      * merge develop
      
      * fix merge
      
      * move files
      
      * move erfinv OP to phi
      
      * remove comment
      
      * move pixel_shuffle OP to phi
      
      * remove comment
      
      * fix PT_REGISTER
      
      * fix NPU
      
      * fix CR
      
      * remove size_sig.cc for PR-CI-Coverage
      639675de
  8. 20 Feb, 2022 1 commit
  9. 15 Feb, 2022 1 commit
    • F
      Move Abs OP to pten (#39492) · fb473067
      From00 committed
      * Move Abs op to pten
      
      * Fix NPU compilation error
      
      * Fix CI error
      
      * Use LaunchSameDimsElementwiseCudaKernel in pten
      fb473067
  10. 28 Jan, 2022 1 commit
    • H
      Move digamma to pten (#39240) · 848ae7dc
      hong committed
      * move digamma to pten; test=develop
      
      * fix mutable_data bugs; test=develop
      
      * remove useless code; test=develop
      
      * remove kernel compute; test=develop
      
      * fix bug; test=develop
      848ae7dc
  11. 17 Jan, 2022 1 commit
  12. 12 Jan, 2022 1 commit
    • Z
      the_one_ps dirs reconstruct (#38804) · 50609214
      ziyoujiyi committed
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      50609214
  13. 03 Nov, 2021 1 commit
  14. 18 Sep, 2021 1 commit
    • H
      Basic PR on Cost Model (#35774) · 5ba9fe6e
      Huihuang Zheng committed
      Add basic Cost Model, it uses executor to run program and profile it to get op time.
      
      This is an early basic version, we will add more functions in the future.
      5ba9fe6e
  15. 15 Sep, 2020 1 commit
  16. 03 Jun, 2020 1 commit
    • Y
      Add crypto python (#24836) · aa47356b
      Yanghello committed
      * add crypto helper for paddle, test=develop
      
      * cryptopp.cmake bug fixed, test=develop
      
      * remove debug build type, test=develop
      
      * fixed CMakeLists for new target, test=develop
      
      * fix CI bug, test=develop
      
      * add cmake option flag DWITH_CRYPTO, test=develop
      
      * add crypto api for python, test=develop
      
      * Revert "add crypto api for python, test=develop"
      
      This reverts commit 3a1cfa9d.
      
      * Revert "Add crypto api (#24694)"
      
      This reverts commit 5a7a517c.
      
      * Revert "Revert "Add crypto api (#24694)""
      
      This reverts commit f952b19f.
      
      * fixed cryptopp cmake building error, test=develop
      
      * change WITH_CRYPTO building option to OFF, test=develop
      
      * fixed cipher test failed, test=develop
      
      * "add crypto api for python, test=develop"
      
      This reverts commit 83fb55c0.
      
      * travis CI bug fixed, test=develop
      
      * fixed test in python3
      
      * test=develop
      
      * fixed unittest, test=develop
      aa47356b
  17. 21 Jan, 2019 1 commit
  18. 10 Jan, 2019 1 commit
  19. 13 Dec, 2018 1 commit
    • S
      fix cmake · deb0d41c
      sneaxiy committed
      fix cmake again
      test=develop
      deb0d41c
  20. 10 Dec, 2018 1 commit
  21. 10 Sep, 2018 1 commit
  22. 18 Jun, 2018 1 commit
  23. 24 May, 2018 1 commit
  24. 23 May, 2018 1 commit
  25. 22 Mar, 2018 1 commit
  26. 07 Mar, 2018 2 commits
  27. 06 Mar, 2018 2 commits
  28. 15 Feb, 2018 1 commit
    • Y
      Update tensor_util.h (#8422) · cfffb1a3
      Yi Wang committed
      * Update tensor_util.h
      
      * Update with moved TensorDesc
      
      * Fix tensur_utils.cu
      
      * Update
      
      * Update
      
      * Update
      
      * Update
      
      * Make tensor_util.cu a symbolic link
      cfffb1a3
  29. 10 Feb, 2018 2 commits
  30. 07 Feb, 2018 1 commit
  31. 06 Feb, 2018 2 commits
  32. 01 Feb, 2018 1 commit
  33. 31 Jan, 2018 1 commit
  34. 30 Jan, 2018 1 commit