- 25 Nov, 2021 14 commits
-
-
Committed by furnace
* [NPU] add int64 support for argsort op
* [NPU] delete debug codes
-
Committed by furnace
* [NPU] add NPU kernel for prior_box op
* [NPU] delete debug codes
-
Committed by Yiqun Liu
-
Committed by Zhen Wang
-
Committed by wuhuanzhou
-
Committed by zmx
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* [heterps]bug fix for _run_from_dataset
* fix heter_server.cc
* fix launch_utils.py
* fix heter_section_worker.cc
* fix. test=develop
* fix. test=develop
-
Committed by From00
* Support multi-stream allocation for CUDA place
* Do not notify the retrying from other streams when free CUDA allocation
* Fix compile error for CPU
* Fix compile error for HIP
* Release memory for StreamSafeCUDAAllocaRetry in malloc_test
* Add FLAGS_use_stream_safe_cuda_allocator
* Fix CI error for 'set_tests_properties'
* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
* Add UT for alloc interface
* Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
-
Committed by WangXi
-
Committed by tianshuo78520a
* Fix static-ci
-
Committed by Zhanlue Yang
* Added GradTensorHolder to Eager Dygraph
* Added accumulation codes to Eager Dygraph
* Fix windows-ci issue
* Fix NPU-CI issue
* Fixed CI-Coverage issue
-
Committed by LiYuRio
-
Committed by xiongkun
* clear LoDTensorArray
* fix bugs
* fix
* fix gpu
-
Committed by Wangzheee
-
Committed by Chen Weihang
-
- 24 Nov, 2021 16 commits
-
-
Committed by piotrekobiIntel
* Add second batch of deprecated mkldnn namespace and macro changes
* Unlock CI
* Fix temporary namespace alias placing
-
Committed by Yuang Liu
-
Committed by Zhanlue Yang
* Added EagerUtils to Eager Dygraph
* Purified include dependencies for global_utils
* Fixed merge conflicts
-
Committed by Aurelius84
-
Committed by Leo Chen
-
Committed by YuanRisheng
* elementwise_mul refactor
* perfect code in test
* delete redundant code
* fix bugs when run test_multiply
* adjust the location of macro
* fix bugs when run ci
-
Committed by zyfncg
* add scalar and scalar_array
* remove DenseTensor include from Scalar and ScalarArray
* remove inner header from scalar_array
* refactor the method of fill_constant and add some comment
-
Committed by Wangzheee
* matmul_convert_int8
* matmul_convert_int8
* matmulconvert_int8
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
-
Committed by zhaoyingli
* adapt auto search
* adapt auto search
* fix matmulv2 compatible
* del debug
-
Committed by Aurelius84
-
Committed by Chen Weihang
* standarded unittest namespace
* fix detail error
-
Committed by 0x45f
* run dy2stat pure fp16 in Linear model
* no use self._pure_fp16_inputs
* add test and fix Adam error in dy2stat pure fp16 training
* use paddle.optimizer.Adam
* run test in gpu
* change test time for CI
* enlarge atol for test_resnet_pure_fp16
* refine code and enlarge atol
* make custom_white_list and custom_black_list take effect for AMP and pure fp16
* check tracer is not None
* use default atol
* change filter_size
* change atol and add some NOTE
-
Committed by zhupengyang
-
Committed by feng_shuai
-
Committed by WangXi
-
Committed by Jiabin Yang
* Add EagerTensor and tests
* remove useless enforce
* remove comment in cmake
* support autograd meta
* support grad node info test
* support grad_node_info
* add more edge test
* remove Python.h
* add tensor wrapper with tests
* support compute require grad and stop gradient
* support sync methods and global utils
* support pure cpu test
* refine error msg
* refine error msg
* refine error info
* fix npu error
-
- 23 Nov, 2021 10 commits
-
-
Committed by pangyoki
* fix inplace bug
* fix custom grad input error
* add unittest
* fix inplace bug
-
Committed by Qi Li
* [XPU] Reorganize xpu device codes in platform, test=develop
* fix xpu_header.h, test=develop
-
Committed by Li Min
Add support for bias is none for fused_attention op.
-
Committed by wanghuancoder
-
Committed by Yuang Liu
-
Committed by tianshuo78520a
-
Committed by Feiyu Chan
-
Committed by wangxinxin08
* modify code about fp16 of dcnv2 trt
-
Committed by Zhanlue Yang
-
Committed by Leo Chen
* sync scope and variable_scope when init executor
* set var_desc for new var
-