- 29 Jul, 2022 2 commits
-
-
Committed by Leo Chen
* remove cudaDeviceContext * remove more template * fix rocm compile
-
Committed by Leo Chen
* init * move CUDAStream to phi * fix compilation * merge develop * add stream_owned_ member * split cuda_stream.h * fix cpu compile * fix constructor * fix bug * fix windows compile * fix inference test_levit * fix windows tests
-
- 19 Jul, 2022 1 commit
-
-
Committed by Leo Chen
* compile into one static library * fix xpu compile * fix xpu compile * fix inference compile * fix inference compile * add custom test * revert one file
-
- 14 Jul, 2022 2 commits
- 11 Jul, 2022 1 commit
-
-
Committed by 王明冬
-
- 06 Jul, 2022 1 commit
-
-
Committed by houj04
-
- 04 Jul, 2022 1 commit
-
-
Committed by Ruibiao Chen
-
- 26 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 24 Jun, 2022 2 commits
-
-
Committed by 王明冬
-
Committed by chenjian
* record memory and op supplement info * update * update * fix a bug * fix memory recording * fix a bug * update * update * fix a bug * update * fix a bug * fix a bug * fix a bug * Revert "fix a bug" This reverts commit c1d4df52762ba9ae7c7e27cd2ba4fc3a7ed9c7a5. * fix a bug * fix format * fix
-
- 14 Jun, 2022 1 commit
-
-
Committed by Wilber
* cmake-lint * update
-
- 10 Jun, 2022 1 commit
-
-
Committed by Ruibiao Chen
* Refactor DeviceContextPool * Adjust header file order
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 04 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 02 Jun, 2022 1 commit
-
-
Committed by sneaxiy
* support CUDAGraph for partial graph * add ut * fix ci * fix ut again because of eager mode * fix kunlun ci * fix win ci
-
- 01 Jun, 2022 1 commit
-
-
Committed by Ruibiao Chen
* Add pinned memory to HostMemoryStats * Add macro for WrapStatAllocator * Fix CI errors
-
- 27 May, 2022 1 commit
-
-
Committed by Ruibiao Chen
* Support memory stats for CPU * Add UTs * Fix typos * Fix typos
-
- 19 May, 2022 1 commit
-
-
Committed by Chen Weihang
* refine enforce code * refine enforce code * fix compile failed * fix infrt failed
-
- 27 Apr, 2022 1 commit
-
-
Committed by Aganlengzi
* [DO NOT MERGE] test op_test * update with more related modifications * split op_test.py to use test=allcases for testing * split op_test.py to use test=allcases for testing
-
- 25 Apr, 2022 1 commit
-
-
Committed by Ruibiao Chen
-
- 07 Apr, 2022 1 commit
-
-
Committed by liutiexing
* Profile Executors * update * fix ut * fix names * update * update
-
- 05 Apr, 2022 1 commit
-
-
Committed by Leo Chen
* enable new executor by default * enable stream safe allocator * test=document_fix;test=coverage * do not use scope in op kernel * fit empty program for new executor * fix communication depend * fix test_sync_batch_norm * skip unsupported place * refine datatransfer * fit for distributed program * fix dependency * fix some ut
-
- 01 Apr, 2022 1 commit
-
-
Committed by From00
* Fix compilation error for gcc-54 * Remove const for gpuStream_t
-
- 30 Mar, 2022 1 commit
-
-
Committed by From00
Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) * Add new API memory_reserved * Add memory_allocated, max_memory_reserved and max_memory_allocated * Fix CI error * Fix CI error * Enhance UT * Add FLAGS_memory_stats_opt * Add STATS macro functions * Add StatAllocator * Fix CI errors * Add UT * Fix CI errors
-
- 27 Mar, 2022 1 commit
-
-
Committed by From00
* Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy * Set FLAGS_use_stream_safe_cuda_allocator to false * Update * Remove unnecessary code * Fix CI errors * Add UT
-
- 25 Mar, 2022 1 commit
-
-
Committed by z8hanghuan
* support multi_dims for tril_triu, *test=kunlun * support multi_dims for tril_triu, *test=kunlun * support multi_dims for tril_triu, *test=kunlun * update xpu.cmake date, support multi_dims for tril_triu, *test=kunlun
-
- 23 Mar, 2022 1 commit
-
-
Committed by From00
* Performance optimize * Optimize GetAllocator, RWLock and ProcessUnfreedAllocation * Remove test file * Fix CI error * Fix CI errors * Fix CI errors
-
- 18 Mar, 2022 1 commit
-
-
Committed by Aganlengzi
-
- 14 Mar, 2022 1 commit
-
-
Committed by Zhong Hui
[multiprocessing] Add paddle.incubate.multiprocessing for sharing tensors between python processes. (#37302) * Add support for paddle.multiprocessing * move multiprocessing to incubate.
-
- 03 Mar, 2022 2 commits
- 28 Feb, 2022 1 commit
-
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * add log for Executor * Profile Allocators * Profile Allocators * adjust interface * remove lock for set * fix Co-authored-by: liutiexing <liutiexing@google.com>
-
- 25 Feb, 2022 1 commit
-
-
Committed by Qi Li
* [ROCm] fix Managed Memory Alloc on HIP, test=develop * update, test=develop
-
- 20 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 15 Feb, 2022 1 commit
-
-
Committed by ronnywang
* [CustomRuntime] Add DeviceManager * [CustomRuntime] Add DeviceInterface * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager * [CustomRuntime] Add plug-in device * [CustomRuntime] Memory module support PluggableDevice * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option * update * [API] update API doc based on comments, test=develop Co-authored-by: qili93 <qili93@qq.com>
-
- 09 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
-
- 08 Feb, 2022 1 commit
-
-
Committed by From00
* Rough implementation for experiment * Support allocate cuda managed memory * Fix CI error * Modify UT * Check whether support memory oversubscription * Fix ROCM Compile error * Fix ROCM Compile error * Fix UT cuda_managed_memory_test * Set UT timeout to 40 * Add UT OOMExceptionTest * Set UT timeout to 50
-
- 06 Feb, 2022 1 commit
-
-
Committed by Wilber
-
- 27 Jan, 2022 1 commit
-
-
Committed by Aurelius84
* Support allocate_from in Tensor and allocate_data in Context * fix #ifdef CUDA * fix cycle depends * fix test_xxx_dev_api failed * fix windows compiling error * fix unittest * modify into PImpl * fix selected rows * add TODO comment * refine interface according reviewer
-