- 10 1月, 2022 4 次提交
-
-
由 liutiexing 提交于
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * profiler skeleton * update * update * update Co-authored-by: Nliutiexing <liutiexing@google.com>
-
由 taixiurong 提交于
-
由 Zhanlue Yang 提交于
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes too pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues
-
由 Chen Weihang 提交于
* unify infer_shape func calling * support set grad infer shape fn for custom op * unify infershape in new executor and eager * remove todo comment * revert infershape in operator
-
- 07 1月, 2022 2 次提交
-
-
由 YuanRisheng 提交于
* refactor flatten grad kernel * fix bugs when run ci unittest * fix bugs when use default GetExpectedPtenKernelArgs * xshape sometimes is has null holder ,fix this bugs
-
由 Leo Chen 提交于
-
- 05 1月, 2022 2 次提交
-
-
由 joanna.wozna.intel 提交于
-
由 joanna.wozna.intel 提交于
* Quantize nearest_interp and nearest_interp_v2 * Check if avx_core supported * Add depthwise_conv2d to supported quantization list
-
- 04 1月, 2022 4 次提交
-
-
由 Qi Li 提交于
-
由 YuanRisheng 提交于
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * move cpu_impl of elementwise kernel to new directory
-
由 Zhanlue Yang 提交于
[Unify Tensors PR #3]Port framework::Tensor members & interfaces to pten::DenseTensor, test=allcases (#38473) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Reverted changes too pten_layout() interface * Removed friend classes
-
由 yaoxuefeng 提交于
heter context support dynamic mf dim
-
- 31 12月, 2021 5 次提交
-
-
由 baoachun 提交于
* add mul_gru_fuse_pass ut * update ut * update ut * update ut timeout setting * update ut
-
由 zmxdream 提交于
-
由 fwenguang 提交于
* [MLU]support calling mlu op from python interface * [MLU]fix * fix * [mlu]fix mlu_places * [mlu]fix required mlu * fix * [MLU]fix tensor copy * [mlu] fix MLUPlace call path
-
由 Zhanlue Yang 提交于
-
由 Chen Weihang 提交于
* unify data layout * fix test_transfer_layout error
-
- 30 12月, 2021 6 次提交
-
-
由 joanna.wozna.intel 提交于
-
由 Feng Xing 提交于
This PR adds runtime flags run_kp_kernel, which choose which op to run for xpu2. There are two: dynamic linked and built from kp.
-
由 Yulong Ao 提交于
* [Auto parallel] Make the id of var and op unique * [Auto Parallel] Rename back dist_context to distop_context
-
由 wenbin 提交于
* dynamic shape clone supported
-
由 xiongkun 提交于
* fix wait for tiexing * fix work2vec model. new_exe support EOF Exception in ReadOp now
-
由 Chen Weihang 提交于
* remove offset in storage * revert api change * fix custom op slice bug * fix mutable_data error
-
- 29 12月, 2021 3 次提交
-
-
由 yaoxuefeng 提交于
add hashtable dynamic mf support
-
由 yaoxuefeng 提交于
add dynamic mf size api
-
由 JZ-LIANG 提交于
* auto parallel sharding base * chmod * add unitest * set unitest cmake dist label * revise code according to rewiew * chmod
-
- 28 12月, 2021 3 次提交
-
-
由 From00 提交于
* fix reshape move storage error * remove needless set type * alloc tensor by shared storage * Utilize StreamSafeCUDAAllocator to support fast GC in new executor * Fix compile error for Windows and ROCm * Fix compile error for Windows * Modify UT stream_safe_cuda_alloc_test * Modify UT stream_safe_cuda_alloc_test * Rewrite fast GC * Rewrite fast GC * Fix compile error for BOOST_GET_CONST * Fix compile error for BOOST_GET_CONST * Changes default stream for StreamSafeCUDAAllocator * Fix a small CI error * Remove some redundant code * Fix conflict * Fix compile error for ROCm * Fix Windoes CI error * Fix CI error * Remove some unnecessary code * Fix CI error * Add UT for fast GC * Fix CI error * add device-agnostic stream class * add stream.h * fix ut * fix cpu compile * Use RWLock in GetAllocator * Fix CI error Co-authored-by: NChen Weihang <chenweihang@baidu.com> Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>
-
由 Leo Chen 提交于
* add completion_nofifier * fix bug * unregist event waiter
-
由 baoachun 提交于
* add mul_lstm_fuse_pass ut * update mul_lstm_fuse_pass ut * update ut * update ut * update ut * add CPU ut cmake setting * update ut
-
- 27 12月, 2021 4 次提交
- 26 12月, 2021 1 次提交
-
-
由 Zhanlue Yang 提交于
* Replaced pten::LoD with paddle::framework::LoD * Overrided CPUVector with CUDAVector * Refactored paddle::framework::Vector
-
- 24 12月, 2021 1 次提交
-
-
由 baoachun 提交于
* add conv+hard_sigmoid fuse pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * update conv_hard_sigmoid_mkldnn_fuse_pass ut * update conv+hard_sigmoid and conv+hard_swish fuse pass ut * update ut * update ut
-
- 23 12月, 2021 5 次提交
-
-
由 Jacek Czaja 提交于
* First set of fixes * - Make more likely to GetBlob find a blobs * - Lint
-
由 yaoxuefeng 提交于
add mem pool
-
由 liutiexing 提交于
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * update EventsWater * fix * split workqueue files * add more tests * fix * bugfix * bugfix * update Co-authored-by: Nliutiexing <liutiexing@google.com>
-
由 baoachun 提交于
* add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut * update mkldnn conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * restrict conv2d data_format in conv_elementwise_add_mkldnn_fuse_pass * update conv_elementwise_add_mkldnn_fuse_pass OpCompat * update conv_elementwise_add_mkldnn_fuse_pass ut * update ut
-
由 zhouweiwei2014 提交于
* add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector * fix comment
-