- 14 Jul, 2023 1 commit
Committed by zhangbo9674
* add inplace interface * support inplace * refine code * fix bug * fix bug * refien code * add file * add interface * refine code * refine code * add phi kernel instruction * refine code * add test * delete unuse code * add test * add test * add deps * delete unused code * fix bug * fix bug
- 05 Jul, 2023 1 commit
Committed by zhangbo9674
* add local scope * refine code * refien code * refine code * support local scope for BuildFuncList * fix bug * add log * fix bug * polish code * fix bug
- 19 Jun, 2023 1 commit
Committed by Aurelius84
[NewExe]Polish InterpreterCore with PImpl and Derived ProgramInterpreter and NewIRInterpreter (#54651) * [NewExe]Polish InterpreterCore with PImpl fix code style add std::move * fix conflict * fix typo * fix typo
- 16 Jun, 2023 1 commit
Committed by Ruibiao Chen
* Run plan in standalone executor * Update codes * Update atol and rtol for py3-CI * Add scope to cache key * Fix CI errors * Fix code style * Update codes * Remove fetch_name in standalone executor * Fix UT * Update codes * Fix new IR bug
- 15 Jun, 2023 1 commit
Committed by hong
* add kernel dialect * change DenseTensorTypeStorage to DenseTensorType * add test case * add first pd_op to kernel dialect * lower pd op to kernel dialect * update * update * remove useless code * add attrite print test * fix bug * update * update * update * update * polish code * fix bug * polish code and add python test * add test * fix test error * add env flag * fix bug * revert test env * change cc_test_old to cc_test * fix build_static bug * fix type test error * udpate cmake * disable test in windows * fix inference compile
- 08 Jun, 2023 1 commit
Committed by Yuanle Liu
- 31 May, 2023 1 commit
Committed by Ruibiao Chen
- 30 May, 2023 1 commit
Committed by Leo Chen
* add timer to log deps * rename flag * add ut
- 10 Apr, 2023 1 commit
Committed by kangguangli
* add strategy force_sequential_run * remove flag * fix * fix * fix * fix * fix * fix * fix * fix * fix
- 02 Mar, 2023 1 commit
Committed by Ruibiao Chen
* Check structed kernel for new executor static build * Update code * Ready for resnet50 * Move transfer_dtype to phi * Ready for transformer * Fix CI errors * Fix layer_norm InferMeta * Remove layer_norm infermeta fix
- 16 Feb, 2023 1 commit
Committed by Ruibiao Chen
* Use StandaloneExecutor in FleetExecutor * Update FLAGS * Fix CI errors * Update code * Add force_root_scope_vars config * Update code * Fix CI errors * Fix test_layer_new errors
- 30 Jan, 2023 1 commit
Committed by Ruibiao Chen
* Support stream priority for standalone executor * Fix compile error * Fix compile error * Fix compile error * Fix compile error * Fix compile error
- 17 Jan, 2023 1 commit
Committed by pangyoki
* new exe supports CUDA Graph * fix * fix * fix * fix FLAGS_use_stream_safe_cuda_allocator in unittest * insert output of coalesce_tensor op to skip_gc_var * fix
- 28 Dec, 2022 1 commit
Committed by Leo Chen
* add skip run * alloc minimum memory * skip check_size in Alloc * skip check_size in Alloc * skip check_size in Alloc * fix cases when tensor is initialized or empty * alloc empty output for place info * add test * increase timeout * format code * skip cpu * add cudnn_deterministic * fit for hostAlloc * follow comments * change check_size to fake_alloc
- 27 Dec, 2022 2 commits
Committed by zhangbo9674
* cinn use interpretercore * fix bug * fix compile bug * fix scope bug * refine code * refine code by comment * refine code by comment
Committed by Ruibiao Chen
* Support priority scheduling for standalone executor * Add CPU test
- 28 Nov, 2022 1 commit
Committed by zhangbo9674
* add trace mode for interpretercore * fix bug * add a ctrl flag * add record for memcpyd2h * polish code * polish code
- 26 Nov, 2022 1 commit
Committed by Leo Chen
* hot fix * fix compile * merge develop * follow comments
- 25 Nov, 2022 1 commit
Committed by Ruibiao Chen
* Move stream_anayzer to interpreter * Refactor StreamAnalyzer * Refactor RunNextInstructionList * Remove no_data_transform_index * Fix typos * Fix data_transfer OpFuncType error * Add event for depend_op * Update transfer OpFuncType for heter place
- 02 Nov, 2022 1 commit
Committed by Ruibiao Chen
* Dispath computation OPs before communication in standalone executor * Update code * Fix CI errors
- 31 Oct, 2022 1 commit
Committed by kangguangli
* replace executor in conditional_block_op.run with standalone_executor * add block_id as the argument of standalone executor's method run; add print for program * fix scope bug about conditional block op * fix bug: unnecessary return of fetch value * fix typo * fix: quantization will set variable persistable, and these variables must exist in global scope * add interpretercore cache for conditional block op but not activate in default * fix bug: local scope reuse for conditional block op * reset scope when conditional block op runs * fix typo * fix typo and code style * add build scope for conditional block op * add skip for transfer_layout kernel * refind code * fix reset_scope * fix reset_scope * refine code * refine code * refine code 1. remove flag use in conditional_block_op 2. pass execution_config to BuildOpFuncList instead of individual parameter * refine code * remove the use of FLAGS_control_flow_use_new_executor_cache * change FLAGS_control_flow_use_new_executor to false
- 19 Oct, 2022 1 commit
Committed by Ruibiao Chen
* Support stream overlap for c_allreduce_sum * Test CI * Add notes * Add SingleStreamGuard for BuildOpFuncList
- 12 Oct, 2022 1 commit
Committed by Leo Chen
* refactor * refine code
- 11 Oct, 2022 1 commit
Committed by Chen Weihang
* remove using lodtensor part1 * polish history code format
- 10 Oct, 2022 1 commit
Committed by Leo Chen
* reduce time cost on atomic in interpretercore * clear code of PrepareAtomic in interpretercore * refine threadpool cache
- 23 Sep, 2022 1 commit
Committed by Ruibiao Chen
* Add ExecutionConfig and fix last-live-op bug for standalone executor * Improve code design
- 20 Sep, 2022 1 commit
Committed by Leo Chen
* add config * add config * follow comments * fix serial run
- 29 Aug, 2022 1 commit
Committed by zhangbo9674
* add interpretercore * refine backward program id * add code * refine program * refine code * create forward/backward_program by prog2graph2prog method * test, do not care * refine code * refine code * refine code * test, do not care * add interpretorcore * add scope * refine scope create method * add jit for new_exe * solve conflict * delete unused code * polish code * polish code * refine scope in inplace * refine for datatransfer * refine _rebuild_from_desc * refine control eager deletion attr * refine used_for_jit * refine jit for infer * op size0 use ori program * polish code * refine jit * refine run_program_op ut * refine inplace * refine control * refine graph helper * refine control * refine inplace * refine buffer_share_inplace_pass * polish code * polish code * refine usage for compilerProgram * refine control * test * test core cache * refine code * refine io.py * increase test_seq2seq timeout * refine convert program * refine interpretercore_cache release * delete buildinplace * refine partial_program && io * refine code for io * test * test * test
- 04 Aug, 2022 1 commit
Committed by 王明冬
- 02 Aug, 2022 1 commit
Committed by Ruibiao Chen
* Refactor build_op_downstream_map for standalone executor * Add some comments
- 29 Jun, 2022 1 commit
Committed by Leo Chen
* separate variable scope and scope * hot fix for lod_tensor_blocking_queue * fix bug that variable exists in global scope
- 23 Jun, 2022 1 commit
Committed by Ruibiao Chen
* Multiplex Workqueue for InterpreterCore * Delete ResetWorkQueueOptions * Update code format
- 16 Jun, 2022 1 commit
Committed by Ruibiao Chen
* Support disable GC for some vars in standalone executor * Setting skip_gc_vars in interprecore construction
- 18 Apr, 2022 1 commit
Committed by Leo Chen
* shrink downstream map * shrink last live ops of var * add comment * fix bug
- 22 Mar, 2022 1 commit
Committed by Leo Chen
* async prepare deps * fix bug that std::future is not set * add ut * refine code * fix standalone ut * disable prof
- 17 Feb, 2022 1 commit
Committed by Leo Chen
* relocate code of interpretercore gc
- 28 Dec, 2021 2 commits
Committed by From00
* fix reshape move storage error * remove needless set type * alloc tensor by shared storage * Utilize StreamSafeCUDAAllocator to support fast GC in new executor * Fix compile error for Windows and ROCm * Fix compile error for Windows * Modify UT stream_safe_cuda_alloc_test * Modify UT stream_safe_cuda_alloc_test * Rewrite fast GC * Rewrite fast GC * Fix compile error for BOOST_GET_CONST * Fix compile error for BOOST_GET_CONST * Changes default stream for StreamSafeCUDAAllocator * Fix a small CI error * Remove some redundant code * Fix conflict * Fix compile error for ROCm * Fix Windoes CI error * Fix CI error * Remove some unnecessary code * Fix CI error * Add UT for fast GC * Fix CI error * add device-agnostic stream class * add stream.h * fix ut * fix cpu compile * Use RWLock in GetAllocator * Fix CI error Co-authored-by: Chen Weihang <chenweihang@baidu.com> Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
Committed by Leo Chen
* add completion_nofifier * fix bug * unregist event waiter
- 23 Dec, 2021 1 commit
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * update EventsWater * fix * split workqueue files * add more tests * fix * bugfix * bugfix * update Co-authored-by: liutiexing <liutiexing@google.com>
- 26 Nov, 2021 1 commit
Committed by wanghuancoder
* clear local scope every setp, test=develop * refine,test=develop * refine, test=develop