1. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Leo Chen committed
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  2. 26 Jul, 2022 1 commit
  3. 21 Jul, 2022 1 commit
  4. 19 Jul, 2022 1 commit
  5. 14 Jul, 2022 1 commit
  6. 08 Jul, 2022 1 commit
  7. 06 Jul, 2022 1 commit
    • Refine StandaloneExecutor (#44076) · 6eed9f49
      Leo Chen committed
      * not run startup program in constructor of StandaloneExecutor
      
      * clear interface of standalone executor
      
      * clean debug code
  8. 02 Jul, 2022 1 commit
    • unify cpu context, part2 (#44012) · 755438a7
      Leo Chen committed
      * fix init()
      
      * delete test_device_context
      
      * replace CPUDeviceContext with CPUContext
      
      * fix test_scalar
      
      * remove dot_op.cc
      
      * fix compile
  9. 30 Jun, 2022 2 commits
  10. 29 Jun, 2022 1 commit
  11. 28 Jun, 2022 1 commit
    • Remove boost::variant (#43100) · b3cf28f8
      Ruibiao Chen committed
      * boost::variant -> paddle::variant
      
      * boost::variant.apply_visit -> paddle::visit
      
      * Update pybind_boost_hraders.h
      
      * Fix CINN compilation errors
      
      * Revert FetchResultType
  12. 27 Jun, 2022 2 commits
  13. 24 Jun, 2022 1 commit
    • record memory and op supplement info (#43550) · 8dd0a3b9
      chenjian committed
      * record memory and op supplement info
      
      * update
      
      * update
      
      * fix a bug
      
      * fix memory recording
      
      * fix a bug
      
      * update
      
      * update
      
      * fix a bug
      
      * update
      
      * fix a bug
      
      * fix a bug
      
      * fix a bug
      
      * Revert "fix a bug"
      
      This reverts commit c1d4df52762ba9ae7c7e27cd2ba4fc3a7ed9c7a5.
      
      * fix a bug
      
      * fix format
      
      * fix
  14. 16 Jun, 2022 1 commit
  15. 05 Jun, 2022 1 commit
  16. 02 Jun, 2022 1 commit
  17. 27 May, 2022 1 commit
  18. 16 May, 2022 1 commit
    • optimize cinn find graph by graph address (#42697) · 661d0800
      jiangcheng committed
      * optimize cinn find graph by graph address
      
      * graph_key use int64_t instead of program string
      
      * fix framework _to_readable_code python code
      
      * rename get_readable_comile_key to get_serialize_comile_key
  19. 11 May, 2022 1 commit
  20. 05 May, 2022 2 commits
  21. 27 Apr, 2022 1 commit
  22. 26 Apr, 2022 2 commits
    • optimize graph_engine pybind (#42192) · 1bf08eca
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add dsm sample method
      
      * add graph_neighbor_sample_v2
      
      * Add graph_neighbor_sample_v2
      
      * fix for loop
      
      * add cpu sample interface
      
      * fix kernel judgement
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * change index settings
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * move cudamemcpy after cuda stream sync
      
      * fix linking problem
      
      * remove comment
      
      * add cpu test
      
      * test
      
      * add cpu test
      
      * change comment
      
      * combine feature table and graph table
      
      * test
      
      * test
      
      * pybind
      
      * test
      
      * test
      
      * test
      
      * test
      
      * pybind
      
      * pybind
      
      * fix cmake
      
      * pybind
      
      * fix
      
      * fix
      
      * add pybind
      
      * add pybind
      
      * optimize pybind
      
      * test
      
      * fix pybind
      
      * fix
      Co-authored-by: DesmonDay <908660116@qq.com>
    • fit for printing cinn_launch op (#42141) · ee56906e
      Leo Chen committed
      * fit for printing cinn_launch op
      
      * update boost::variant caster for bytes
  23. 24 Apr, 2022 2 commits
    • [CustomDevice] add eager mode support (#42034) · ccafd2e5
      ronnywang committed
    • combine graph_table and feature_table in graph_engine (#42134) · 0e0f7da6
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add dsm sample method
      
      * add graph_neighbor_sample_v2
      
      * Add graph_neighbor_sample_v2
      
      * fix for loop
      
      * add cpu sample interface
      
      * fix kernel judgement
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * change index settings
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * move cudamemcpy after cuda stream sync
      
      * fix linking problem
      
      * remove comment
      
      * add cpu test
      
      * test
      
      * add cpu test
      
      * change comment
      
      * combine feature table and graph table
      
      * test
      
      * test
      
      * pybind
      
      * test
      
      * test
      
      * test
      
      * test
      
      * pybind
      
      * pybind
      
      * fix cmake
      
      * pybind
      
      * fix
      
      * fix
      
      * add pybind
      
      * add pybind
      Co-authored-by: DesmonDay <908660116@qq.com>
  24. 19 Apr, 2022 1 commit
  25. 17 Apr, 2022 1 commit
    • [Perf] Optimize dygraph scheduling performance (#41696) · 7ee31a96
      Chen Weihang committed
      * split phi and fluid infermeta context
      
      * resolve conflict
      
      * fix type error
      
      * optimize scheduling perf
      
      * spec small vector size
      
      * replace all grad var name
      
      * fix test failed
      
      * move init default signature
      
      * polish details
      
      * polish details
      
      * fix no init bug
      
      * init sig for tests
      
      * add init sig for infer
      
      * fix infrt error
      
      * fix infrt failed
      
      * fix kunlun error
      
      * fix infrt failed
  26. 15 Apr, 2022 3 commits
    • Add eager string tensor (#41039) · a22b68b8
      Jack Zhou committed
      * Add core.eager.StringTensor __init__ which pyarray args can be passed
      
      * Add the numpy method of core.eager.StringTensor
      
      * revert tensor.to_string modification
      
      * Add ToPyObject for core.eager.StringTensor
      
      * Add debug string for core.eager.StringTensor
      
      * Remove place args of core.eager.StringTensor temporarily
      
      * Fix check string_tensor error
      
      * remove dtype of core.eager.StringTensor
      
      * add core.eager.StringTensor unittest
      
      * remove pstring from VarDesc
      
      * Add InitStringTensorWithStringTensor
      
      * Remove to_string modification
      
      * Remove zero_copy arg from StringTensor creator
    • Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      limingshu committed
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • [MLU] add mlu new profiler (#41138) · fc208b7e
      fwenguang committed
      * [MLU] add mlu new profiler
      
      * fix format
  27. 14 Apr, 2022 1 commit
  28. 09 Apr, 2022 1 commit
    • Unittest recover (#41431) · 7a07c4a5
      zhaocaibei123 committed
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove code unuseful
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: esythan <esythan@126.com>
  29. 08 Apr, 2022 1 commit
  30. 07 Apr, 2022 2 commits
  31. 05 Apr, 2022 1 commit
    • Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e
      Zhang Ting committed
      * switch autotune
      
      * implement AutoTuneCache
      
      * implement AutoTuneCache class
      
      * add pybind api
      
      * add dygraph test
      
      * support static mode and eager mode and improve unittests
      
      * rename the SwitchAutoTune Class and improve tests
      
      * improve AutoTuneStatus and reduce the cost of tests
  32. 01 Apr, 2022 1 commit
    • [Eager] Support pinned (#41035) · f3270fc8
      wanghuancoder committed
      * support pinned, test=develop
      
      * support async_write, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine,test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop