- 17 Oct, 2022 2 commits
-
-
Committed by Zhang Zheng
Optimize performance of depthwise_conv. Config: input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1
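For reference, the output shape implied by this config can be worked out with the standard convolution size formula. This is a minimal sketch: the helper `conv_out_size` is a hypothetical name introduced here, and the NCHW layout is an assumption, not something stated in the commit.

```python
def conv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Standard convolution output-size formula for one spatial dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


# Config from the commit message (layout assumed to be NCHW).
n, c, h, w = 2048, 1024, 4, 4
out_c, channels_per_group, kh, kw = 1024, 1, 4, 4

# Depthwise: one filter per input channel, so output channels == input channels.
out_shape = [n, out_c, conv_out_size(h, kh), conv_out_size(w, kw)]
print(out_shape)  # [2048, 1024, 1, 1]
```

With a 4x4 filter over a 4x4 input and no padding, each channel collapses to a single output value.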
-
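Committed by Zhang Zheng
To improve performance, move the label bounds check from the Python side into the kernel, reducing calls to extra ops such as min, max, and synchronous copies. The current template parameter IgnoreIndex takes effect only when ignore_index lies in [0, dim). However, when a label value is out of bounds and ignore_index equals that label, the computation should still proceed normally. Although the current logic does not produce wrong results, it is still logically flawed, and the template parameter IgnoreIndex is unnecessary.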
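The intended behaviour can be sketched in plain Python. This is an illustration only, not the kernel from the commit: the function name `softmax_cross_entropy`, the default `ignore_index=-100`, and the choice to raise `ValueError` are assumptions. The point is the order of the checks: a label equal to `ignore_index` is skipped first, even if it lies outside [0, dim), and only then is the bounds check applied.

```python
import math


def softmax_cross_entropy(logits, labels, ignore_index=-100):
    """Per-row loss with the label bounds check done inside the loop.

    Hypothetical reference implementation, not the actual kernel.
    """
    losses = []
    for row, label in zip(logits, labels):
        dim = len(row)
        # Ignored labels contribute zero loss, whether or not they are in range.
        if label == ignore_index:
            losses.append(0.0)
            continue
        # Bounds check happens here rather than in a separate pre-pass.
        if not 0 <= label < dim:
            raise ValueError(f"label {label} is out of range [0, {dim})")
        # Numerically stable log-sum-exp.
        m = max(row)
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum - row[label])
    return losses


# Label 5 is out of range for dim=2 but equals ignore_index, so it is skipped.
print(softmax_cross_entropy([[0.0, 0.0]], [5], ignore_index=5))  # [0.0]
```

Handling `ignore_index` as a runtime comparison makes a separate compile-time switch redundant in this sketch, which mirrors the commit's argument that the IgnoreIndex template parameter is unnecessary.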
-
- 13 Oct, 2022 2 commits
-
-
Committed by 傅剑寒
Fix set_value failure when the source tensor has fp16 dtype and the destination value is a number (dev PR link: #46801)
-
Committed by Sławomir Siwek
* Revert pool+grad oneDNN kernel conversion (#45989) * [PHI] transpose2_grad op migration (#46139) * op migrated, Copy(OneDNNContext, ...) added * mutable_data & op registration in fluid removed * refactoring * OneDNNGetDataType to uppercase * missing cpu check added, handler moved to .h file * name changed to transpose_grad * Copy changed back to TensorCopy * Resizing corrected, Copy(OneDNNContext) removed Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com> Co-authored-by: Paulina Gacek <paulina.gacek@intel.com>
-
- 12 Oct, 2022 1 commit
-
-
Committed by niuliling123
Cherry-pick 46541: ensure the ResNet50, TSM, and deeplabv3 models get automatic layout tuning with zero modifications
-
- 11 Oct, 2022 6 commits
-
-
Committed by Feiyu Chan
-
Committed by Sławomir Siwek
-
Committed by Sławomir Siwek
-
Committed by Sławomir Siwek
* [PHI] Migrate gelu kernels (#45596) * gaussian random * mkldnn to onednn renaming * fix merge conflicts * remove fluid code * onednn renaming * gelu fwd * sort activations * gelu gradient * remove unused macros * merge conflicts * fix merge conflicts * remove extra constraint from gelu op * [PHI] relu6_grad kernel (#46501) * Relu6 * remove fluid handler * add individual kernel signature * coding style * replace bounded_relu with clip * whitespace * code style
-
Committed by Sławomir Siwek
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
-
Committed by YuanRisheng
* fix concat bug * fix ci bugs * fix ci bugs
-
- 10 Oct, 2022 5 commits
-
-
Committed by Sławomir Siwek
[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN kernels (#45863) (#46727) * [PHI] Migrate concat+grad, expand+grad, fill_constant, nearest_interp and bilinear_interp oneDNN kernels (#45863) * Migrate concat+grad, expand+grad, fill_constant, nearest_interp_v2 and bilinear_interp_v2 oneDNN kernels to PHI * Remove old namespace variable * Fix invalid out dims error * Add mutable_data method to concat output * Add check for -1 dim before computing out_dims * Capitalize oneDNNGetDataType function name * Change fill_constant kernel to correct PHI kernel * Attempt to fix dims error * Fix fill_constant (full) kernel * update dependencies Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
-
Committed by Sławomir Siwek
* [PHI] Migrate sgd and stack oneDNN kernels (#46374) * Convert slice+grad oneDNN fluid kernels to PHI * Change mutable_data to Alloc * Refactor licences * update dependencies Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
-
Committed by Sławomir Siwek
* Convert split, pad and pad3d kernels * Convert slice+grad oneDNN fluid kernels to PHI * change out->mutable_data to dev_ctx.Alloc Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
-
Committed by Sławomir Siwek
* init * remove softmaxop * merge dev * correct dir * style
-
Committed by Sławomir Siwek
* First approach * Shape kernel corrected * Compilation error fixed * Resize corrected * Registered types added * Mistake corrected & types added * sum kernel deleted Co-authored-by: Paulina Gacek <paulina.gacek.pl@gmail.com>
-
- 29 Sep, 2022 3 commits
-
-
Committed by 傅剑寒
Add FP16 support for uniform in dygraph mode on Nvidia GPU (dev PR link: PR46212)
-
Committed by zyfncg
* set flag of clip_extra in save_inference_model to true (#46151) * open the clip_extra flag in paddle.static.save_inference_model, test=allcase (#46456) * Open the clip_extra flag in TracedLayer.save_inference_model (#46473) * open the clip_extra flag in paddle.static.save_inference_model, test=allcase * set the default value of clip_extra in TracedLayer from False to True, test=allcase * update english doc of paddle.static.save_inference_model, test=document_fix (#46484) * Fix clip_extra logic in remove_training_info (#46534) * fix clip_extra code in remove_training_info * revert rnn opmaker clear
-
Committed by Lin Manhui
[CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and InverseFloorDivideFunctor (#45051) (#46504)
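The commit message does not say why std::trunc() was removed. One plausible motivation, stated here purely as an assumption, is that routing integer division through a floating-point truncation loses precision for large 64-bit values. The sketch below shows that effect in Python; `div_via_double` and `div_exact` are hypothetical helpers and do not reproduce the actual functors.

```python
import math


def div_via_double(a, b):
    """Divide by converting to float and truncating, as a trunc-based path would."""
    return int(math.trunc(float(a) / float(b)))


def div_exact(a, b):
    """Exact integer division truncating toward zero (C-style semantics)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


a = 2**53 + 1  # not exactly representable as a 64-bit float
print(div_via_double(a, 1))  # 9007199254740992
print(div_exact(a, 1))       # 9007199254740993
```

For small operands the two paths agree; they diverge only once an operand exceeds the 53 bits of precision a double can hold.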
-
- 28 Sep, 2022 1 commit
-
-
Committed by zyfncg
[cherry-pick] Clear extra attrs of some ops in OpMaker (#46150, #46321, #46418, #46451, #46457) (#46553) * Clear extra attributes of some Op in OpMaker (Part4) (#46060) * clear extra attr of some ops in opmaker * revert clear use_cudnn for pool * fix test_operator_desc * fix Attr interface of OperatorBase * clear extra attrs of condition op in opmaker (#46150) * Clear extra attrs of lookup_table_v2 in OpMaker (#46321) * clear extra attrs of look_up_table_v2 in opmaker * fix bug * clear extra attrs of quantize op in opmaker (#46418) * delete repeated item * clear extra attrs of distribute op in opmaker (#46451) * clear extra attrs of sequence_softmax in opmaker (#46457)
-
- 27 Sep, 2022 2 commits
-
-
Committed by zhaoyingli
-
Committed by zyfncg
* Clear extra attrs of elementwise op in OpMaker (#45845) * clear extra attrs of elementwise op in opmaker * fix op_debug_string_test * fix bug of grad_add * fix sort of runtime attrs * Clear extra attrs of scale in OpMaker (#45984) * clear extra attr of scale in opmaker * fix sum bug * fix merge conflict * fix minus * Clear extra attributes of some Op in OpMaker (Part4) (#46060) * clear extra attr of some ops in opmaker * revert clear use_cudnn for pool * fix test_operator_desc * fix Attr interface of OperatorBase * fix code style
-
- 26 Sep, 2022 1 commit
-
-
Committed by Hui Zhang
* fix sub sign reverse for mkldnn * refactor code as comment * remove useless
-
- 20 Sep, 2022 10 commits
-
-
Committed by houj04
* [XPU] update xdnn activations. (#46246) * [XPU] update xpu cmake. test=kunlun
-
Committed by HongyuJia
* polish code comments * polish data_device_transform.cc
-
Committed by Jiabin Yang
* [Eager] Fix ocr (#46124) * fix linspace error in amp * fix log * fix amp error * fix ocr error which caused by amp * add more check * rename dtype ns * [Eager Bug fix]Fix Detection (#46147) * fix linspace error in amp * fix log * fix amp error * Revert "Simplify size op impl (#45808)" This reverts commit c252b1de. * fix_seg * fix detection Co-authored-by: Chen Weihang <sunny_cwh@163.com>
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong. * Cherry-pick of PR 46045 * Fix bug of reduce_sum kp op. * Fix bug of reduce_sum kp operator compilation. If compilation device is XPU, eigen kernel should be ignored.
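The commit does not state the root cause. A common way results go wrong once an element count exceeds INT32_MAX is that the count or an index is held in a 32-bit signed integer and wraps around. The sketch below only illustrates that wraparound; `wrap_int32` is a hypothetical helper and the tensor size is made up.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(n):
    """Reinterpret an integer as a 32-bit signed value (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


numel = 3 * 1024**3  # 3221225472 elements, larger than INT32_MAX
print(numel > INT32_MAX)   # True
print(wrap_int32(numel))   # -1073741824
```

Any loop bound or offset derived from the wrapped value would cover the wrong range, which is why such code paths typically need 64-bit indices.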
-
Committed by WangZhen
* Fix TransDataBackend Error when call unsqueeze using MKL Tensor * Add UT * Refine UT
-
Committed by zhangkaihuo
cherry-pick : #46016, #46021, #45974 * [Sparse]Sparse add support gpu (#45974) * [Sparse]Remove unused code (#46021) * [Sparse] Add infer meta (#46016)
-
Committed by Jiabin Yang
* fix linspace error in amp * fix log * fix amp error
-
Committed by Charles-hit
* support cast op backward refuse forward and fix some bugs (#46173) * support cast op backward refuse forward * Fix the bug of high order unit test framework * support sign op backward refuse forward (#46002)
-
Committed by niuliling123
cherry-pick from #45826: LayoutAutotune supports inplace-type ops; adjust UseAutotune according to the review comments on "Add eager layout autotune" #45409; move the LayoutAutotune check into the controller, consistent with the AMP check
-
Committed by zyfncg
* fix wrong eigen header include * fix compile bug * fix nan_inf_utils_detail * fix resource_manager * fix conv_miopen_helper
-
- 19 Sep, 2022 7 commits
-
-
Committed by RichardWooSJTU
[vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) (#46193) * fix return order error and duplicate results with specific inputs
-
Committed by weishengying
-
Committed by Charles-hit
* add unit test for sum higher level op (#45961) * support slice op backward refuse forward and add high level unit test (#45960) * support tile op backward refuse forward (#45942) * support expand_v2 op backward refuse forward (#45941) * support concat backward refuse forward (#45940)
-
Committed by Jiabin Yang
* [PHI] Support bmm and bmm_grad in xpu (#45887) * support bmm and bmm_grad in xpu * add error removal * test=kunlun * refactor code for better structure * test=kunlun * add fp16 kernel for bmm * test=kunlun * test=kunlun
-
Committed by minghaoBD
Co-authored-by: RichardWooSJTU <37864677+RichardWooSJTU@users.noreply.github.com>
-
Committed by sneaxiy
-
Committed by Chen Weihang
This reverts commit c252b1de.
-