- 27 Nov, 2021 2 commits
-
-
Committed by JingZhuangzhuang
-
Committed by Aganlengzi
* [NPU] reorganization for device API abstraction
* [NPU] delete old files
* [NPU] fix npu_collective_helper
* [NPU] fix collective_helper
* [NPU] fix ut
* [NPU] mod memory allocation and hccl_helper
* [NPU] fix place_type
* [NPU] split enforce.h
* move acl* call into npu_info
* merge conflict
* fix merge
* merge conflict
* merge conflict
-
- 26 Nov, 2021 22 commits
-
-
Committed by zmx
-
Committed by Zhanlue Yang
-
Committed by Steffy-zxf
* fix data parallel when VOCAB var in program
-
Committed by YUNSHEN XIE
-
Committed by wanghuancoder
* clear local scope every step, test=develop
* refine, test=develop
* refine, test=develop
-
Committed by wanghuancoder
-
Committed by zhaocaibei123
* test
* test
* rm test
* update
* update
* update
* add unittest
* update
* update save
-
Committed by Chen Weihang
-
Committed by Zhanlue Yang
-
Committed by YuanRisheng
* Support parse kernel key by multi-inputs
* optimize code according to reviewer
-
Committed by Li Min
* Fix bugs when bias is none for static graph for fused_attention op.
-
Committed by Zhanlue Yang
reset_inplace_version removes all inplace-related records from VarBase/VariableWrapper. Its essential purpose is to let you use inplace operations as if they were their non-inplace versions, which will of course cause unexpected consequences if not used with care. This is essentially a hack interface to satisfy one specific request.
-
Committed by Yuang Liu
-
Committed by wangzhen38
* add tdm sample
* add tdm sample in c++
* update tdm sample
* modify sample count
* fix conflict
* add set_date
* fix cmake error
* fix bug of proto
* update index_dataset proto
* update cmake
* fix error cmake
* fix cmake mkldnn
* fix cmake proto
* update cmake proto
* update cmake
* update rec
* update dataset
* update dataset
* update dataset
* update dataset
* update dataset
* update coverage
* update ci
* goback4
* fix npu ci
* add xxhash dep
-
Committed by smallv0221
* fix dropout static when axis != None
* update dropout test
* add dropout test
* fix test
* Update test_dropout_op.py
* Update test_dropout_op.py
* fix testcase
* fix testcase
* Update test_dropout_op.py
* fix testcase
* fix testcase
* optimize perf
* add new test
* fix testcase
-
Committed by zyfncg
-
Committed by Chen Weihang
-
Committed by Yuang Liu
-
Committed by Sing_chan
* block xxhash warning of c4711
* modify according to zhouwei's comment
* fix syntax error
-
Committed by Zhanlue Yang
-
Committed by Zhanlue Yang
-
Committed by Zhanlue Yang
* Added GradTensorHolder to Eager Dygraph
* Added accumulation codes to Eager Dygraph
* Added tensor utils to Eager Dygraph
* Resolve compilation issues
* Fixed issues
-
- 25 Nov, 2021 16 commits
-
-
Committed by Sing_chan
* make third_party's cmake get source code directly 2
* modify according to zhouwei's comment
* eager needs mkldnn to compile
-
Committed by zyfncg
* add scalar and scalar_array
* remove DenseTensor include from Scalar and ScalarArray
* remove inner header from scalar_array
* refactor the method of fill_constant and add some comment
* add fill_constant kernel using ScalarArray
* modify some prompt
* remove fill_constant kernel with no shape
-
Committed by furnace
* [NPU] add int64 support for argsort op
* [NPU] delete debug codes
-
Committed by furnace
* [NPU] add NPU kernel for prior_box op
* [NPU] delete debug codes
-
Committed by Yiqun Liu
-
Committed by Zhen Wang
-
Committed by wuhuanzhou
-
Committed by Baibaifan
-
Committed by zmx
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* [heterps]bug fix for _run_from_dataset
* fix heter_server.cc
* fix launch_utils.py
* fix heter_section_worker.cc
* fix. test=develop
* fix. test=develop
-
Committed by From00
* Support multi-stream allocation for CUDA place
* Do not notify the retrying from other streams when free CUDA allocation
* Fix compile error for CPU
* Fix compile error for HIP
* Release memory for StreamSafeCUDAAllocaRetry in malloc_test
* Add FLAGS_use_stream_safe_cuda_allocator
* Fix CI error for 'set_tests_properties'
* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
* Add UT for alloc interface
* Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
-
Committed by Sing_chan
* block unknown option /arch:SSE3
* modify according to zhouwei's comment
-
Committed by zhouweiwei2014
* add new API paddle.nn.initializer.Dirac
* fix doc
-
Committed by Leo Chen
* fix program cache key
* bug fix
* fix cache problem
* remove unused code
-
Committed by WangXi
-
Committed by tianshuo78520a
* Fix static-ci
-
Committed by Zhanlue Yang
* Added GradTensorHolder to Eager Dygraph
* Added accumulation codes to Eager Dygraph
* Fix windows-ci issue
* Fix NPU-CI issue
* Fixed CI-Coverage issue
-