- 14 Jul, 2022 1 commit
-
-
Committed by wanghuancoder
* Compilation optimization
-
- 08 Jul, 2022 1 commit
-
-
Committed by WangZhen
* Pybind JitLayer VarBase Function and add python UT * Add multi program load UT * Fix UT place error * Update jit.save param name * Remove some comments * Polish cmakelists * Polish JitLayer in Python * Fix comments
-
- 06 Jul, 2022 1 commit
-
-
Committed by Leo Chen
* not run startup program in constructor of StandaloneExecutor * clear interface of standalone executor * clean debug code
-
- 02 Jul, 2022 1 commit
-
-
Committed by Leo Chen
* fix init() * delete test_device_context * replace CPUDeviceContext with CPUContext * fix test_scalar * remove dot_op.cc * fix compile
-
- 30 Jun, 2022 2 commits
-
-
Committed by Leo Chen
* support scope_guard * fix test
-
Committed by Ruibiao Chen
* Remove boost::variant for FetchResultType * Fix pybind errors
-
- 29 Jun, 2022 1 commit
-
-
Committed by ronnywang
-
- 28 Jun, 2022 1 commit
-
-
Committed by Ruibiao Chen
* boost::variant -> paddle::variant * boost::variant.apply_visit -> paddle::visit * Update pybind_boost_hraders.h * Fix CINN compilation errors * Revert FetchResultType
-
- 27 Jun, 2022 2 commits
-
-
Committed by Aganlengzi
* [CustomDevice]add custom place supports * sync format
-
Committed by Chen Weihang
* add get_op_names api * Update pybind.cc
-
- 24 Jun, 2022 1 commit
-
-
Committed by chenjian
* record memory and op supplement info * update * update * fix a bug * fix memory recording * fix a bug * update * update * fix a bug * update * fix a bug * fix a bug * fix a bug * Revert "fix a bug" This reverts commit c1d4df52762ba9ae7c7e27cd2ba4fc3a7ed9c7a5. * fix a bug * fix format * fix
-
- 16 Jun, 2022 1 commit
-
-
Committed by ronnywang
* [CustomKernel] add custom kernel c api * update * update * fix unable to export capi Co-authored-by: ronny1996 <524019753@qq.com>
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 02 Jun, 2022 1 commit
-
-
Committed by sneaxiy
* support CUDAGraph for partial graph * add ut * fix ci * fix ut again because of eager mode * fix kunlun ci * fix win ci
-
- 27 May, 2022 1 commit
-
-
Committed by Ruibiao Chen
* Support memory stats for CPU * Add UTs * Fix typos * Fix typos
-
- 16 May, 2022 1 commit
-
-
Committed by jiangcheng
* optimize cinn find graph by graph address * graph_key use int64_t instead of program string * fix framework _to_readable_code python code * rename get_readable_comile_key to get_serialize_comile_key
-
- 11 May, 2022 1 commit
-
-
Committed by Allen Guo
* update to popart v2.5.0 * use a specific version of sdk2.5.0
-
- 05 May, 2022 2 commits
- 27 Apr, 2022 1 commit
-
-
Committed by Aganlengzi
* [DO NOT MERGE] test op_test * update with more related modifications * split op_test.py to use test=allcases for testing * split op_test.py to use test=allcases for testing
-
- 26 Apr, 2022 2 commits
-
-
Committed by seemingwang
* extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add dsm sample method * add graph_neighbor_sample_v2 * Add graph_neighbor_sample_v2 * fix for loop * add cpu sample interface * fix kernel judgement * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * change index settings * recover test * recover test * fix spelling * recover * fix * move cudamemcpy after cuda stream sync * fix linking problem * remove comment * add cpu test * test * add cpu test * change comment * combine feature table and graph table * test * test * pybind * test * test * test * test * pybind * pybind * fix cmake * pybind * fix * fix * add pybind * add pybind * optimize pybind * test * fix pybind * fix Co-authored-by: DesmonDay <908660116@qq.com>
-
Committed by Leo Chen
* fit for printing cinn_launch op * update boost::variant caster for bytes
-
- 24 Apr, 2022 2 commits
-
-
Committed by ronnywang
-
Committed by seemingwang
* extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add dsm sample method * add graph_neighbor_sample_v2 * Add graph_neighbor_sample_v2 * fix for loop * add cpu sample interface * fix kernel judgement * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * change index settings * recover test * recover test * fix spelling * recover * fix * move cudamemcpy after cuda stream sync * fix linking problem * remove comment * add cpu test * test * add cpu test * change comment * combine feature table and graph table * test * test * pybind * test * test * test * test * pybind * pybind * fix cmake * pybind * fix * fix * add pybind * add pybind Co-authored-by: DesmonDay <908660116@qq.com>
-
- 19 Apr, 2022 1 commit
-
-
Committed by Zhang Ting
-
- 17 Apr, 2022 1 commit
-
-
Committed by Chen Weihang
* split phi and fluid infermeta context * resolve conflict * fix type error * optimize scheduling perf * spec small vector size * replace all grad var name * fix test failed * move init defalut signature * polish details * polish details * fix no init bug * init sig for tests * add init sig for infer * fix infrt error * fix infrt failed * fix kunlun error * fix infrt failed
-
- 15 Apr, 2022 3 commits
-
-
Committed by Jack Zhou
* Add core.eager.StringTensor __init__ which pyarray args can be passed * Add the numpy method of core.eager.StringTensor * revert tensor.to_string modification * Add ToPyObject for core.eager.StringTensor * Add debug string for core.eager.StringTensor * Remove place args of core.eager.StringTensor temporarily * Fix check string_tensor error * remove dtype of core.eager.StringTensor * add core.eager.StringTensor unittest * remove pstring from VarDesc * Add InitStringTensorWithStringTensor * Remove to_string modification * Remove zero_copy arg from StringTensor creator
-
Committed by limingshu
* change cudnn helper for auto-tune * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm. * Fix the bug in calculating and printing current step cache hit rate. * Improve the autotune cache and fix unittest. * Change the key from AlgorithmType to int64_t. * Fix unittest for cpu-only env. * change ChooseAlgoByWorkspace for heuristic mode Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by fwenguang
* [MLU] add mlu new profiler * fix format
-
- 14 Apr, 2022 1 commit
-
-
Committed by liutiexing
* executor perf statistics * fix ut * fix ut * fix ut * add ut * add ut
-
- 09 Apr, 2022 1 commit
-
-
Committed by zhaocaibei123
* update name * update name * fix test * fix fleet bind * update name * update name * fix test * fix gpups wrapper * remove Push/Pull/Load/Save with context in client and wrapper base class * fix * fix * remove some interface * fix * remove * code style * recover * fix * remove code unused * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable * fix * fix * fix * recover * remove unused code * recover unittest * fix * remove * fix * remove code unuseful * remove * fix * recover * remove Co-authored-by: esythan <esythan@126.com>
-
- 08 Apr, 2022 1 commit
-
-
Committed by ronnywang
-
- 07 Apr, 2022 2 commits
-
-
Committed by Thunderbrook
* afs wrapper * format * format * macro
-
Committed by liutiexing
* Profile Executors * update * fix ut * fix names * update * update
-
- 05 Apr, 2022 1 commit
-
-
Committed by Zhang Ting
* switch autotune * implement AutoTuneCache * implement AutoTuneCache class * add pybind api * add dygraph test * support static mode and eager mode and improve unittests * rename the SwitchAutoTune Class and improve tests * improve AutoTuneStatus and reduce the cost of tests
-
- 01 Apr, 2022 1 commit
-
-
Committed by wanghuancoder
* support pinned, test=develop * support async_write, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine,test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop
-
- 30 Mar, 2022 1 commit
-
-
Committed by From00
Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) * Add new API memory_reserved * Add memory_allocated, max_memory_reserved and max_memory_allocater * Fix CI error * Fix CI error * Enhance UT * Add FLAGS_memory_stats_opt * Add STATS macro functions * Add StatAllocator * Fix CI errors * Add UT * Fix CI errors
-
- 23 Mar, 2022 2 commits
-
-
Committed by Jiabin Yang
* suppor sharding api * support multi api for sharding in eager * support multi api for sharding in eager * fix test * fix test coverage
-
Committed by chenjian
* add event record for model profiling * fix format * fix format * fix code example bug * no * add profiler statistic * add profiler feature * fix bug * fix bug * fix bug * fix bug * required: gpu * required: gpu * fix bug * required: gpu * fix ci bug * fix ci error * fix ci error * upgrade document * fix doc * fix ci bug * add doc and fix bug * nothing * fix bug * fix format bug * modify format * add deprecated description for old profiler * fix bug * fix bug * fix * add load_profiler_reuslt doc * add load_profiler_reuslt doc * add load_profiler_reuslt doc * help fix old profiler sample code * add api doc * fix format * fix api doc * fix api doc format * fix api doc format * fix api doc c format * fix api doc format
-
- 22 Mar, 2022 1 commit
-
-
Committed by Leo Chen
* async prepare deps * fix bug that std::future is not set * add ut * refine code * fix standalone ut * disable prof
-