- 13 Jan, 2022: 4 commits
-
-
Committed by chentianyu03
* move dot_dev api into dot_kernel.h
* add infermate header
* modify to dotkerel in dot_op.h
* move conj dev api into complex_kernel.h
* move sign dev api into sign_kernel.h
* move scale dev api into kernel.h and remove infermete.h
* rm paddle/pten/include/math.h
* rm paddle/pten/include/math.h
* rm include dir
* rm paddle/pten/include/math.h
* fix conflict with develop branch
* rm devContext in conj_op.h
* add the missing complex_kernel header
-
Committed by jakpiase
* base changes for mul reimplementation
* empty commit
* tmp save
* full implementation of mul bf16/fp32 fwd bwd
* CI fix
* CI rerun
* changed unity build cmake to avoid gpu issues
* removed mul mkldnn from unity build
* added skipping tests if not cpu_bf16
* CI fix
* CI fix
* CI fix
-
Committed by Chen Weihang
* fix mkldnn invalid infershape
* add unittest for mkldnn in new executor
* add import os
-
Committed by 石晓伟
-
- 12 Jan, 2022: 13 commits
-
-
Committed by Zhang Ting
* code clean
* [part 3] change type of function args
-
Committed by Sylwester Fraczek
* fix conv act int8 scale
* add unit test for conv+hard_swish
-
Committed by xiaoting
* support 5d for nearest
* update nearest3d unittest, test=develop
* fix approve ci, test=develop
* fix approve ci, test=develop
-
Committed by Lijunhui
* init elem_max_grad op
* optimize code and reply review comments
* ternary functors
* apply new reduce func
* move functor to .h
* multi-outputs init
* rearrange code
* modified functors
* optimizer code
* pass nullptr
* revert the last change as seg fault occurs
* optimize code
* remove inplace
* remove comments
-
Committed by Chen Weihang
* remove hybird dir
* resolve conflict
-
Committed by Lijunhui
* init commit
* multi-outputs init commit
* optimize code
* remove inplace
-
Committed by Zhang Ting
-
Committed by chentianyu03
* move dot_dev api into dot_kernel.h
* add infermate header
* modify to dotkerel in dot_op.h
* move conj dev api into complex_kernel.h
* move sign dev api into sign_kernel.h
-
Committed by YuanRisheng
* refactor the impl of elementwise grad kernel
* refactor impl of elementwise grad kernel(cuda)
* fix compile bugs
-
Committed by Zhang Ting
-
Committed by Zhang Ting
-
Committed by Zhang Ting
-
Committed by limingshu
* first commit
* fix wrong filename
* fix the misspelled name
* fix gpu config wrapper
* modify according to pr advices
* fix GpuLauchConfig1D api bugs
* change the config for dropout grad
* fix bugs
* modification according to pr advices
* modification according to pr advices
-
- 11 Jan, 2022: 6 commits
-
-
Committed by YuanRisheng
-
Committed by zyfncg
* refactor matmul directory in pten
* fix merge conflict
* add dot_grad kernel
* add dot_grad kernel in pten
* add matmul_grad kernel
* update the code
* delete useless code in fluid
* fix some bug of running matmul grad kernel
* fix merge conflict
* refactor some code
* refactor code
-
Committed by Zhang Zheng
* fix bug when inplace strategy
* fix
* fix
* fix
* fix
* fix
-
Committed by niuliling123
-
Committed by limingshu
* fix the wrong filename
* first commit
* first commit
* remove rest useless headers
* for ci approval
-
Committed by Sing_chan
* support vs2019 compilation in windows
* do not modify pow_op's original compute logic
-
- 10 Jan, 2022: 11 commits
-
-
Committed by Haohongxiang
* add lstsq gpu kernel
* update
* add docs_en
* modify ut
* fix bugs
* modify example in docs_en
* remove lstsq_op.cu from ROCM cmake
* modify docs_en
* modify docs_en
* modify docs_en
* remove unnecessary TensorCopy
-
Committed by baoachun
* refactor the forward implementation of reshape npu op
* update reshape npu op
* update reshape npu op
-
Committed by Chen Weihang
-
Committed by Yulong Ao
* Add the backward support for QR
* Remove unnecessary comments
-
Committed by shangliang Xu
-
Committed by taixiurong
-
Committed by wangxinxin08
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3] Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues, test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
* Modified framework::Tensor to inherit from DenseTensor
* Reverted changes to pten_layout() interface
* Removed friend classes
* Rearranged cfunction calls from tensor.data<void>() to tensor.data()
* Fixed CI issues
* Fixed lite issues
* Fixed data() interface issues, test=allcases
* Resolved IsInitialized() issues
* Fixed ResetHolder() issues
* Fixed MKLDNN & Storage issues
* Resolved ShareBufferWith() issues
* Fixed LoD issues
-
Committed by andyjpaddle
* add maxunpool3d op
* update doc for maxunpool3d op
* update doc for maxunpool3d op
* update doc for maxunpool3d op
* update sample code for maxunpool3d
* add maxunpool1d op
* update some code for maxunpool1d
-
Committed by Guoxia Wang
-
Committed by Guoxia Wang
-
- 07 Jan, 2022: 4 commits
-
-
Committed by YuanRisheng
* refactor flatten grad kernel
* fix bugs when running ci unittest
* fix bugs when using default GetExpectedPtenKernelArgs
* xshape sometimes has a null holder; fix this bug
-
Committed by wangxinxin08
* add mish operator and api
* remove redundant code and modify grad_atol of mish unittest
* modify mish code to be consistent with other activation implementation
-
Committed by zhangbo9674
* add multi tensor for adam
* add merged_adam op
* refine code
* refine adam compute logic
-
Committed by Li Min
* Add fp16 support for scale/bias for fused_layernnorm_residual_dropout_bias op.
-
- 06 Jan, 2022: 2 commits
-
-
Committed by YuanRisheng
* move mid api and rename kernel
* use empty kernel
-
Committed by Thomas Young
-