  1. 27 Jun, 2022 2 commits
  2. 24 Jun, 2022 1 commit
    • record memory and op supplement info (#43550) · 8dd0a3b9
      chenjian committed
      * record memory and op supplement info
      
      * update
      
      * update
      
      * fix a bug
      
      * fix memory recording
      
      * fix a bug
      
      * update
      
      * update
      
      * fix a bug
      
      * update
      
      * fix a bug
      
      * fix a bug
      
      * fix a bug
      
      * Revert "fix a bug"
      
      This reverts commit c1d4df52762ba9ae7c7e27cd2ba4fc3a7ed9c7a5.
      
      * fix a bug
      
      * fix format
      
      * fix
  3. 16 Jun, 2022 1 commit
  4. 05 Jun, 2022 1 commit
  5. 02 Jun, 2022 1 commit
  6. 27 May, 2022 1 commit
  7. 16 May, 2022 1 commit
    • optimize cinn find graph by graph address (#42697) · 661d0800
      jiangcheng committed
      * optimize cinn find graph by graph address
      
      * graph_key use int64_t instead of program string
      
      * fix framework _to_readable_code python code
      
      * rename get_readable_comile_key to get_serialize_comile_key
  8. 11 May, 2022 1 commit
  9. 05 May, 2022 2 commits
  10. 27 Apr, 2022 1 commit
  11. 26 Apr, 2022 2 commits
    • optimize graph_engine pybind (#42192) · 1bf08eca
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add dsm sample method
      
      * add graph_neighbor_sample_v2
      
      * Add graph_neighbor_sample_v2
      
      * fix for loop
      
      * add cpu sample interface
      
      * fix kernel judgement
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * change index settings
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * move cudamemcpy after cuda stream sync
      
      * fix linking problem
      
      * remove comment
      
      * add cpu test
      
      * test
      
      * add cpu test
      
      * change comment
      
      * combine feature table and graph table
      
      * test
      
      * test
      
      * pybind
      
      * test
      
      * test
      
      * test
      
      * test
      
      * pybind
      
      * pybind
      
      * fix cmake
      
      * pybind
      
      * fix
      
      * fix
      
      * add pybind
      
      * add pybind
      
      * optimize pybind
      
      * test
      
      * fix pybind
      
      * fix
      Co-authored-by: DesmonDay <908660116@qq.com>
    • fit for printing cinn_launch op (#42141) · ee56906e
      Leo Chen committed
      * fit for printing cinn_launch op
      
      * update boost::variant caster for bytes
  12. 24 Apr, 2022 2 commits
    • [CustomDevice] add eager mode support (#42034) · ccafd2e5
      ronnywang committed
    • combine graph_table and feature_table in graph_engine (#42134) · 0e0f7da6
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add dsm sample method
      
      * add graph_neighbor_sample_v2
      
      * Add graph_neighbor_sample_v2
      
      * fix for loop
      
      * add cpu sample interface
      
      * fix kernel judgement
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * change index settings
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * move cudamemcpy after cuda stream sync
      
      * fix linking problem
      
      * remove comment
      
      * add cpu test
      
      * test
      
      * add cpu test
      
      * change comment
      
      * combine feature table and graph table
      
      * test
      
      * test
      
      * pybind
      
      * test
      
      * test
      
      * test
      
      * test
      
      * pybind
      
      * pybind
      
      * fix cmake
      
      * pybind
      
      * fix
      
      * fix
      
      * add pybind
      
      * add pybind
      Co-authored-by: DesmonDay <908660116@qq.com>
  13. 19 Apr, 2022 1 commit
  14. 17 Apr, 2022 1 commit
    • [Perf] Optimize dygraph scheduling performance (#41696) · 7ee31a96
      Chen Weihang committed
      * split phi and fluid infermeta context
      
      * resolve conflict
      
      * fix type error
      
      * optimize scheduling perf
      
      * spec small vector size
      
      * replace all grad var name
      
      * fix test failed
      
      * move init default signature
      
      * polish details
      
      * polish details
      
      * fix no init bug
      
      * init sig for tests
      
      * add init sig for infer
      
      * fix infrt error
      
      * fix infrt failed
      
      * fix kunlun error
      
      * fix infrt failed
  15. 15 Apr, 2022 3 commits
    • Add eager string tensor (#41039) · a22b68b8
      Jack Zhou committed
      * Add core.eager.StringTensor __init__ which pyarray args can be passed
      
      * Add the numpy method of core.eager.StringTensor
      
      * revert tensor.to_string modification
      
      * Add ToPyObject for core.eager.StringTensor
      
      * Add debug string for core.eager.StringTensor
      
      * Remove place args of core.eager.StringTensor temporarily
      
      * Fix check string_tensor error
      
      * remove dtype of core.eager.StringTensor
      
      * add core.eager.StringTensor unittest
      
      * remove pstring from VarDesc
      
      * Add InitStringTensorWithStringTensor
      
      * Remove to_string modification
      
      * Remove zero_copy arg from StringTensor creator
    • Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      limingshu committed
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • [MLU] add mlu new profiler (#41138) · fc208b7e
      fwenguang committed
      * [MLU] add mlu new profiler
      
      * fix format
  16. 14 Apr, 2022 1 commit
  17. 09 Apr, 2022 1 commit
    • Unittest recover (#41431) · 7a07c4a5
      zhaocaibei123 committed
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove code unuseful
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: esythan <esythan@126.com>
  18. 08 Apr, 2022 1 commit
  19. 07 Apr, 2022 2 commits
  20. 05 Apr, 2022 1 commit
    • Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e
      Zhang Ting committed
      * switch autotune
      
      * implement AutoTuneCache
      
      * implement AutoTuneCache class
      
      * add pybind api
      
      * add dygraph test
      
      * support static mode and eager mode and improve unittests
      
      * rename the SwitchAutoTune Class and improve tests
      
      * improve AutoTuneStatus and reduce the cost of tests
  21. 01 Apr, 2022 1 commit
    • [Eager] Support pinned (#41035) · f3270fc8
      wanghuancoder committed
      * support pinned, test=develop
      
      * support async_write, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine,test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
  22. 30 Mar, 2022 1 commit
    • Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) · afe02e9d
      From00 committed
      
      * Add new API memory_reserved
      
      * Add memory_allocated, max_memory_reserved and max_memory_allocated
      
      * Fix CI error
      
      * Fix CI error
      
      * Enhance UT
      
      * Add FLAGS_memory_stats_opt
      
      * Add STATS macro functions
      
      * Add StatAllocator
      
      * Fix CI errors
      
      * Add UT
      
      * Fix CI errors
  23. 23 Mar, 2022 2 commits
    • Support sharding (#40637) · fe291daf
      Jiabin Yang committed
      * support sharding api
      
      * support multi api for sharding in eager
      
      * support multi api for sharding in eager
      
      * fix test
      
      * fix test coverage
    • Add profiler features (#40357) · c15e3823
      chenjian committed
      * add event record for model profiling
      
      * fix format
      
      * fix format
      
      * fix code example bug
      
      * no
      
      * add profiler statistic
      
      * add profiler feature
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * required: gpu
      
      * required: gpu
      
      * fix bug
      
      * required: gpu
      
      * fix ci bug
      
      * fix ci error
      
      * fix ci error
      
      * upgrade document
      
      * fix doc
      
      * fix ci bug
      
      * add doc and fix bug
      
      * nothing
      
      * fix bug
      
      * fix format bug
      
      * modify format
      
      * add deprecated description for old profiler
      
      * fix bug
      
      * fix bug
      
      * fix
      
      * add load_profiler_reuslt doc
      
      * add load_profiler_reuslt doc
      
      * add load_profiler_reuslt doc
      
      * help fix old profiler sample code
      
      * add api doc
      
      * fix format
      
      * fix api doc
      
      * fix api doc format
      
      * fix api doc format
      
      * fix api doc c format
      
      * fix api doc format
  24. 22 Mar, 2022 1 commit
  25. 21 Mar, 2022 1 commit
  26. 16 Mar, 2022 1 commit
  27. 14 Mar, 2022 2 commits
    • Support custom op and paddle.autograd.backward in eager (#40423) · 227fa408
      Jiabin Yang committed
      * eager, test=develop
      
      * fix bug, test=develop
      
      * eager, test=develop
      
      * merge legacy to fluid
      
      * eager, test=develop
      
      * eager, test=develop
      
      * Refactor TensorAdd func by template and remove gradient_accumulation in eager
      
      * Remove needless target name
      
      * eager, test=develop
      
      * eager, test=develop
      
      * Use overload instead of template
      
      * Remove legacy code
      
      * Remove legacy code
      
      * selectedrows, test=develop
      
      * Remove DataType test
      
      * eager, test=develop
      
      * eager, test=develop
      
      * support gan, test=develop
      
      * Using Tensor directly instead of using EagerTensor
      
      * support gradient_accumulation
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * refine code
      
      * ptb, test=develop
      
      * Rename all EagerTensor to Tensor
      
      * Rename some EagerTensor to Tensor
      
      * rename EagerTensor to EagerVariable
      
      * eager, test=develop
      
      * eager, test=develop
      
      * eager, test=develop
      
      * eager, test=develop
      
      * add more test
      
      * eager, test=develop
      
      * Support copiable selected rows and merge develop
      
      * save load, eager, test=develop
      
      * save load, eager, test=develop
      
      * refine, test=develop
      
      * remove useless _set_value method
      
      * refine, test=develop
      
      * refine, test=develop
      
      * revert static_runner, test=develop
      
      * EagerTensor to Tensor, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * clear grad, test=develop
      
      * merge, develop
      
      * merge, develop
      
      * merge, test=develop
      
      * merge, test=develop
      
      * Support quant and part of slice
      
      * support legacy static save
      
      * extend slim tests time
      
      * remove imperative on inference
      
      * remove imperative on inference
      
      * merge develop
      
      * fix typo
      
      * fix typo
      
      * split slice related code into 2 part for imperative and eager
      
      * split slice from inference
      
      * split slice from inference
      
      * fix test_tensor_register_hook
      
      * support custom op in eager mode
      
      * fix inference deps error
      
      * split eager utils from custom operator
      
      * fix type match
      
      * fix typo
      Co-authored-by: Wang Huan <wanghuan29@baidu.com>
      Co-authored-by: Weilong Wu <veyron_wu@163.com>
      Co-authored-by: wanghuancoder <wanghuancoder@163.com>
    • [multiprocessing] Add paddle.incubate.multiprocessing for sharing tensors between python processes. (#37302) · e553f758
      Zhong Hui committed
      
      * Add support for paddle.multiprocessing
      * move multiprocessing to incubate.
  28. 12 Mar, 2022 1 commit
  29. 10 Mar, 2022 1 commit
  30. 09 Mar, 2022 1 commit
  31. 08 Mar, 2022 1 commit
    • add python profiler package (#40065) · 10325a82
      chenjian committed
      * add python profiler package
      
      * update according to review
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * add unit test
      
      * Revert "add unit test"
      
      This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48.
      
      * reduce for pr
      
      * add unit test
      
      * modify for pr
      
      * fix unittest
      
      * update for ci coverage
      
      * modify according to review
      
      * fix bug
      
      * improve coverage