- 25 Nov, 2021 21 commits
-
-
Committed by furnace
* [NPU] add int64 support for argsort op
* [NPU] delete debug codes
-
Committed by furnace
* [NPU] add NPU kernel for prior_box op
* [NPU] delete debug codes
-
Committed by Yiqun Liu
-
Committed by Zhen Wang
-
Committed by wuhuanzhou
-
Committed by Baibaifan
-
Committed by zmx
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* [heterps]bug fix for _run_from_dataset
* fix heter_server.cc
* fix launch_utils.py
* fix heter_section_worker.cc
* fix. test=develop
* fix. test=develop
-
Committed by From00
* Support multi-stream allocation for CUDA place
* Do not notify the retrying from other streams when free CUDA allocation
* Fix compile error for CPU
* Fix compile error for HIP
* Release memory for StreamSafeCUDAAllocaRetry in malloc_test
* Add FLAGS_use_stream_safe_cuda_allocator
* Fix CI error for 'set_tests_properties'
* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
* Add UT for alloc interface
* Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
-
Committed by Sing_chan
* block unknown option /arch:SSE3
* modify according to zhouwei's comment
-
Committed by zhouweiwei2014
* add new API paddle.nn.initializer.Dirac
* fix doc
-
Committed by Leo Chen
* fix program cache key
* bug fix
* fix cache problem
* remove unused code
-
Committed by WangXi
-
Committed by tianshuo78520a
* Fix static-ci
-
Committed by Zhanlue Yang
* Added GradTensorHolder to Eager Dygraph
* Added accumulation codes to Eager Dygraph
* Fix windows-ci issue
* Fix NPU-CI issue
* Fixed CI-Coverage issue
-
Committed by LiYuRio
-
Committed by Chen Weihang
* hot fix for dataloader thread error
* polish comment
* fix type in comment, test=document_fix
-
Committed by xiongkun
* clear LoDTensorArray
* fix bugs
* fix
* fix gpu
-
Committed by Matsumoto GAO
* add zeropad2d v0.1
* add zeropad2d v0.2
* add zeropad2d v0.3
* add zeropad2d v0.3
* add zeropad2d v0.3
* add zeropad2d v0.4
* add zeropad2d v0.5
* add zeropad2d v0.5 codestyle
* add zeropad2d v0.5 codestyle
* add zeropad2d v0.6 functional
* add zeropad2d v0.6 functional
* add zeropad2d v0.6 functional
-
Committed by Wangzheee
-
Committed by Chen Weihang
-
Committed by Leo Chen
* skip compiled program
* fix ut
-
- 24 Nov, 2021 19 commits
-
-
Committed by piotrekobiIntel
* Add second batch of deprecated mkldnn namespace and macro changes
* Unlock CI
* Fix temporary namespace alias placing
-
Committed by Yuang Liu
-
Committed by Zhanlue Yang
* Added EagerUtils to Eager Dygraph
* Purified include dependencies for global_utils
* Fixed merge conflicts
-
Committed by Sing_chan
-
Committed by Thunderbrook
* pybind core
* set use psgpu
-
Committed by Aurelius84
-
Committed by Jiawei Wang
-
Committed by Leo Chen
-
Committed by YuanRisheng
* elementwise_mul refactor
* perfect code in test
* delete redundant code
* fix bugs when run test_multiply
* adjust the location of macro
* fix bugs when run ci
-
Committed by zyfncg
* add scalar and scalar_array
* remove DenseTensor include from Scalar and ScalarArray
* remove inner header from scalar_array
* refactor the method of fill_constant and add some comment
-
Committed by Wangzheee
* matmul_convert_int8
* matmul_convert_int8
* matmulconvert_int8
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
-
Committed by zhaoyingli
* adapt auto search
* adapt auto search
* fix matmulv2 compatible
* del debug
-
Committed by tianshuo78520a
Fix op-benchmark CI
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation
* Add the local id for devices
* Add some comments
-
Committed by Aurelius84
-
Committed by Chen Weihang
* standardized unittest namespace
* fix detail error
-
Committed by 0x45f
* run dy2stat pure fp16 in Linear model
* no use self._pure_fp16_inputs
* add test and fix Adam error in dy2stat pure fp16 training
* use paddle.optimizer.Adam
* run test in gpu
* change test time for CI
* enlarge atol for test_resnet_pure_fp16
* refine code and enlarge atol
* make custom_white_list and custom_black_list take effect for AMP and pure fp16
* check tracer is not None
* use default atol
* change filter_size
* change atol and add some NOTE
-
Committed by zhupengyang
-
Committed by feng_shuai
-