- 07 Jan, 2022 4 commits
-
-
Committed by YuanRisheng
* refactor flatten grad kernel
* fix bugs when running CI unit tests
* fix bugs when using the default GetExpectedPtenKernelArgs
* xshape sometimes has a null holder; fix this bug
-
Committed by wangxinxin08
* add mish operator and api
* remove redundant code and modify grad_atol of mish unittest
* modify mish code to be consistent with other activation implementations
-
Committed by zhangbo9674
* add multi tensor for adam
* add merged_adam op
* refine code
* refine adam compute logic
-
Committed by Li Min
* Add fp16 support for scale/bias for fused_layernorm_residual_dropout_bias op.
-
- 06 Jan, 2022 7 commits
-
-
Committed by YuanRisheng
* move mid api and rename kernel
* use empty kernel
-
Committed by Thomas Young
-
Committed by limingshu
* fix the wrong filename
* first commit
-
Committed by zyfncg
* adjust the full kernel
* remove creation.h
* use Empty to create tensor in full
-
Committed by YuanRisheng
* move gpu_impl of elementwise kernel
* change copyright to 2022
-
Committed by jakpiase
* added exp activation and use_dst_for_bwd kernels
* CI RERUN
* minor change
-
- 05 Jan, 2022 7 commits
-
-
Committed by Lijunhui
* init commit: new elem_mul_grad
* add template specialization for complex in multiply
* reply review comments
* correct dx and dy computation when T is complex
* reply review comments
* update to new ReduceRunctor
* mul-output broadcast
* call functions
* call functions with comments
* remove comments
-
Committed by TTerror
-
Committed by wangxinxin08
-
Committed by chentianyu03
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
* fix compile bugs
* move reduce files by new rule
* add set header
* format code style
* merge develop and fix conflict
* merge develop and fix conflict

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
-
Committed by jakpiase
* fix for matmul_v2 broadcasting
* fix for output shape not broadcasted
-
Committed by TTerror
* add huber_loss for kunlun
* update xpu.cmake
* update unit tests
* update unit tests
* update elementwise_add
* update elementwise_add
* update elementwise_add
-
Committed by crystal
* add elementwise div
* move mul and div grad functor
* Combine multiple CUDA kernels
* Update the reduce interface call
* add multi-output
* add multi-output div
* add branch judge
* Package branch
* Combine the x and y functions into one
-
- 04 Jan, 2022 7 commits
-
-
Committed by niuliling123
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
-
Committed by Qi Li
-
Committed by Aurelius84
* Fix memcpyD2H sync behavior with other stream
* add wait
* add wait
* add wait
-
Committed by YuanRisheng
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
* move cpu_impl of elementwise kernel to new directory
-
Committed by furnace
[NPU] add pad and pad_grad
-
Committed by jakpiase
-
Committed by Chen Weihang
* move inner cast api to cast_kernel.h
* resolve conflict
-
- 31 Dec, 2021 9 commits
-
-
Committed by Zhangjingyu06
* [XPU] add split op for kunlun2, *test=kunlun
* [XPU] add split op for kunlun2, *test=kunlun
* [XPU] add split op for kunlun, *test=kunlun

Co-authored-by: QingshuChen <chenqingshu@baidu.com>
-
Committed by JYChen
* add new api/op kthvalue
* kthvalue cuda kernel to cub sorting
* fix example code error
* throw errors instead of LOG in cuda sort
* throw errors by PADDLE_ENFORCE
-
Committed by YuanRisheng
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
* fix compile bugs
-
Committed by zhiboniu
-
Committed by xiaoting
* add fold operators, test=develop
* add fold operators, test=develop
* add fold operators, test=develop
* update fold op error test, test=develop
* fix unittest, test=develop
* fix unittest, test=develop
-
Committed by Huihuang Zheng
Paddle new APIs: put_along_axis. Xu Huang is on holiday so we created this PR to work on it. It is based on his PR: https://github.com/PaddlePaddle/Paddle/pull/37921
-
Committed by zhiboniu
-
Committed by Chen Weihang
* unify data layout
* fix test_transfer_layout error
-
Committed by YuanRisheng
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
-
- 30 Dec, 2021 6 commits
-
-
Committed by zhiboniu
LGTM
-
Committed by houj04
* add sigmoid cross entropy with logits to kl1. test=kunlun
* add sigmoid cross entropy with logits to kl1. test=kunlun
-
Committed by zhangyk0314
Add exp, abs_grad, reciprocal, reciprocal_grad operators for XPU and update xpu2_op_list.h, test=kunlun (#38570)
-
Committed by JYChen
* add new OP mode
* rename trans-variable name and fix UT
-
Committed by Haohongxiang
* add cpu kernel of lstsq
* update
* modify code style
* modify unittest
* remove support for complex
-
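Committed by zhangkaihuo
* Bind the cuSparse handle to DeviceContext so that ops do not create and destroy it themselves
* Add wrappers for the cuSparse APIs that convert between dense and sparse
* Add unit tests for the wrapped APIs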
-