- 10 Jan, 2022 (17 commits)
-
-
Committed by Haohongxiang
* add lstsq gpu kernel
* update
* add docs_en
* modify ut
* fix bugs
* modify example in docs_en
* remove lstsq_op.cu from ROCM cmake
* modify docs_en
* modify docs_en
* modify docs_en
* remove unnecessary TensorCopy
-
Committed by LiYuRio
-
Committed by Yuang Liu
-
Committed by baoachun
* refactor the forward implementation of reshape npu op
* update reshape npu op
* update reshape npu op
-
Committed by Chen Weihang
-
Committed by Yulong Ao
* Add the backward support for QR
* Remove unnecessary comments
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3] Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues, test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
* Modified framework::Tensor to inherit from DenseTensor
* Reverted changes to pten_layout() interface
* Removed friend classes
* Rearranged cfunction calls from tensor.data<void>() to tensor.data()
* Fixed CI issues
* Fixed lite issues
* Fixed data() interface issues, test=allcases
* Resolved IsInitialized() issues
* Fixed ResetHolder() issues
* Fixed MKLDNN & Storage issues
* Resolved ShareBufferWith() issues
* Fixed LoD issues
* Removed interfaces & members from lod_tensor, test=allcases
-
Committed by shangliang Xu
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Revert "Add EventsWaiter". This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* profiler skeleton
* update
* update
* update
Co-authored-by: Nliutiexing <liutiexing@google.com>
-
Committed by taixiurong
-
Committed by wangxinxin08
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3] Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues, test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
* Modified framework::Tensor to inherit from DenseTensor
* Reverted changes to pten_layout() interface
* Removed friend classes
* Rearranged cfunction calls from tensor.data<void>() to tensor.data()
* Fixed CI issues
* Fixed lite issues
* Fixed data() interface issues, test=allcases
* Resolved IsInitialized() issues
* Fixed ResetHolder() issues
* Fixed MKLDNN & Storage issues
* Resolved ShareBufferWith() issues
* Fixed LoD issues
-
Committed by Chen Weihang
* unify infer_shape func calling
* support set grad infer shape fn for custom op
* unify infershape in new executor and eager
* remove todo comment
* revert infershape in operator
-
Committed by LiYuRio
-
Committed by andyjpaddle
* add maxunpool3d op
* update doc for maxunpool3d op
* update doc for maxunpool3d op
* update doc for maxunpool3d op
* update sample code for maxunpool3d
* add maxunpool1d op
* update some code for maxunpool1d
-
Committed by Guoxia Wang
-
Committed by Guoxia Wang
-
- 07 Jan, 2022 (7 commits)
-
-
Committed by YuanRisheng
* refactor flatten grad kernel
* fix bugs when running CI unit tests
* fix bugs when using the default GetExpectedPtenKernelArgs
* xshape sometimes has a null holder; fix this bug
-
Committed by wangxinxin08
* add mish operator and api
* remove redundant code and modify grad_atol of mish unittest
* modify mish code to be consistent with other activation implementations
-
Committed by zhangbo9674
* add multi tensor for adam
* add merged_adam op
* refine code
* refine adam compute logic
-
Committed by niuliling123
-
Committed by LiYuRio
-
Committed by Leo Chen
-
Committed by Li Min
* Add fp16 support for scale/bias for fused_layernorm_residual_dropout_bias op.
-
- 06 Jan, 2022 (12 commits)
-
-
Committed by Leo Chen
-
Committed by YuanRisheng
* move mid api and rename kernel
* use empty kernel
-
Committed by Thomas Young
-
Committed by chentianyu03
* move eigen/reduce.h implementation into cpu/reduce.h
* ctx to dev_ctx
-
Committed by wanghuancoder
-
Committed by Zhanlue Yang
* Handled special sum_grad_op code gen in Eager Dygraph
* Fixed merge issues
-
Committed by wenbin
* bug fix
* remove blank
-
Committed by limingshu
* fix the wrong filename
* first commit
-
Committed by zyfncg
* adjust the full kernel
* remove creation.h
* use Empty to create tensor in full
-
Committed by YuanRisheng
* move gpu_impl of elementwise kernel
* change copyright to 2022
-
Committed by jakpiase
* added exp activation and use_dst_for_bwd kernels
* CI RERUN
* minor change
-
- 05 Jan, 2022 (4 commits)
-
-
Committed by Lijunhui
* init commit: new elem_mul_grad
* add template specialization for complex in multiply
* reply to review comments
* correct dx and dy computation when T is complex
* reply to review comments
* update to new ReduceRunctor
* mul-output broadcast
* call functions
* call functions with comments
* remove comments
-
Committed by From00
* Fix bug of GetAllocatorInterfaceTest
* Replace some shared_ptr with unique_ptr
* Change Alloc call
-
Committed by joanna.wozna.intel
-
Committed by TTerror
-