- 10 Oct, 2022 1 commit
-
-
Committed by Sławomir Siwek
* First approach * Shape kernel corrected * Compilation error fixed * Resize corrected * Registered types added * Mistake corrected & types added * sum kernel deleted Co-authored-by: Paulina Gacek <paulina.gacek.pl@gmail.com>
-
- 29 Sep, 2022 2 commits
-
-
Committed by 傅剑寒
Add FP16 support for uniform in dygraph mode on Nvidia GPU. Dev PR link: PR46212
-
Committed by Lin Manhui
[CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and InverseFloorDivideFunctor (#45051) (#46504)
-
- 27 Sep, 2022 1 commit
-
-
Committed by zhaoyingli
-
- 20 Sep, 2022 6 commits
-
-
Committed by houj04
* [XPU] update xdnn activations. (#46246) * [XPU] update xpu cmake. test=kunlun
-
Committed by HongyuJia
* polish code comments * polish data_device_transform.cc
-
Committed by Jiabin Yang
* [Eager] Fix ocr (#46124) * fix linspace error in amp * fix log * fix amp error * fix ocr error which caused by amp * add more check * rename dtype ns * [Eager Bug fix]Fix Detection (#46147) * fix linspace error in amp * fix log * fix amp error * Revert "Simplify size op impl (#45808)" This reverts commit c252b1de. * fix_seg * fix detection Co-authored-by: Chen Weihang <sunny_cwh@163.com>
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong. * Cherry-pick of PR 46045 * Fix bug of reduce_sum kp op. * Fix bug of reduce_sum kp operator compilation. If compilation device is XPU, eigen kernel should be ignored.
-
Committed by zhangkaihuo
cherry-pick : #46016, #46021, #45974 * [Sparse]Sparse add support gpu (#45974) * [Sparse]Remove unused code (#46021) * [Sparse] Add infer meta (#46016)
-
Committed by zyfncg
* fix wrong eigen header include * fix compile bug * fix nan_inf_utils_detail * fix resource_manager * fix conv_miopen_helper
-
- 19 Sep, 2022 4 commits
-
-
Committed by RichardWooSJTU
[vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) (#46193) * fix return order error and duplicate results with specific inputs
-
Committed by Jiabin Yang
* [PHI] Support bmm and bmm_grad in xpu (#45887) * support bmm and bmm_grad in xpu * add error removal * test=kunlun * refactor code for better structure * test=kunlun * add fp16 kernel for bmm * test=kunlun * test=kunlun
-
Committed by sneaxiy
-
Committed by Chen Weihang
This reverts commit c252b1de.
-
- 15 Sep, 2022 1 commit
-
-
Committed by WangZhen
Support 0 shapes input Tensor for MKL slice kernel
-
- 14 Sep, 2022 2 commits
-
-
Committed by engineer1109
Fix compilation errors with CUDA 11.7
-
Committed by ykkk2333
-
- 13 Sep, 2022 1 commit
-
-
Committed by JingZhuangzhuang
-
- 09 Sep, 2022 3 commits
-
-
Committed by Chen Weihang
* add fusion dir and fuse_softmax_mask kernel * remove fusion kernel dir * migrate infershape * fix code error
-
Committed by xiaoguoguo626807
* modify slice infershape * code style * modify slice_unittest
-
Committed by Chen Weihang
* simplify size op * trans to cuda manually * fix copy error
-
- 08 Sep, 2022 2 commits
-
-
Committed by piotrekobi
* gaussian random * mkldnn to onednn renaming * fix merge conflicts * remove fluid code * onednn renaming * Move classes from mkldnn_reuse.h to onednn_reuse.h * Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI * Refactor grad kernels into separate files * Fix CI failures * Fix Codestyle * Implement reviewer suggestions * Add new lines after includes for readability Co-authored-by: Silv3S <slawomir.siwek@intel.com>
-
Committed by Leo Guo
-
- 07 Sep, 2022 9 commits
-
-
Committed by Chen Weihang
* add save kernel * add save_sr_kernel * remove original save_op * add save gpu kernel * remove combine kernel * add port.h include * add save selected rows test * remove useless kernel.h
-
Committed by houj04
* [XPU] update xdnn to 0906. test=kunlun * [XPU] update xdnn to 0907. test=kunlun
-
Committed by piotrekobi
* gaussian random * mkldnn to onednn renaming * fix merge conflicts * Migrate reduce_op oneDNN kernels to phi * Remove unnecessary header * remove fluid code * onednn renaming * Change std::vector<int64_t> to IntArray * Fix code style * Move classes from mkldnn_reuse.h to onednn_reuse.h * Move more functions from mkldnn_helper.h to onednn_helpper.h * Change MKLDNN to OneDNN in VLOG message * Implement reviewer suggestions Co-authored-by: Silv3S <slawomir.siwek@intel.com>
-
Committed by WangZhen
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
-
Committed by houj04
-
Committed by limingshu
* first commit * merged with develop * merged with develop * fix merge sequential one dims bugs
-
Committed by Sławomir Siwek
* scale kernel * endline * add inplace * fix merge conflicts * Merge conflicts
-
Committed by zhangkaihuo
-
Committed by sneaxiy
* fix amp kernel * update to remove PADDLE_WITH_XPU macro
-
- 06 Sep, 2022 8 commits
-
-
Committed by YuanRisheng
* add tensor array * fix ci bugs * fix ci bugs * fix ci bugs * fix ci bugs * update by comment * update code
-
Committed by ykkk2333
-
Committed by ykkk2333
-
Committed by Weilong Wu
[Eager, Performance optimization] reduce_all interface move reduce_all flag from python to C++ (#45744) * [Eager, Performance optimization] move reduce_all flag from python to c++ * polish reduce_all * fix ci error * fix errors
-
Committed by Weilong Wu
* [Eager, Performance optimization] reduce_max / min polish * polish reduce_max / min * update min/max kernel reduce_all logic * fix a mistake * fix ci errors * fix errors
-
Committed by xiaohemaikoo
-
Committed by LielinJiang
* fix grad error of groupnorm op when cuda version==11.7
-
Committed by Chen Weihang
-