- 30 Sep, 2022 1 commit
-
-
Committed by sneaxiy
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* add bfloat16 to selu_grad to pass CI
* fix selu grad compilation error
-
- 29 Sep, 2022 7 commits
-
-
Committed by carryyu
-
Committed by Zhang Zheng
* Move valid check from python to kernel
* fix error throw
* fix
* invalid label check
* fix
* Revert "fix" This reverts commit 79fad6799cfa4b30423dbc84e67d7d843d22b84a.
* Revert "invalid label check" This reverts commit 402a9707390ad5386b3222e85844b92d2e9b9fa4.
* Revert "fix" This reverts commit 09ba3080ee0587447f875c19cdf060485f15ae3b.
* Revert "fix error throw" This reverts commit a901bfcc2179d5c120ec29af766f392b122dab52.
* Revert "Move valid check from python to kernel" This reverts commit baa03cc4ef82d8d45516c30dfb52bf5aead30748.
* final fix
* fix
* fix
-
Committed by Leo Guo
Add index_select, index_select_grad, and reduce_min kernels and their unit tests for kunlun. Register index_select, index_select_grad, reduce_min, sqrt, and sqrt_grad in xpu2_op_list. test=kunlun (#46557)
-
Committed by carryyu
* fix P40 topk: Make the optimized topk compatible with P40.
* fix P40 topk: Make the optimized topk compatible with P40.
* fix P40 topk: Make the optimized topk compatible with P40.
-
Committed by ming1753
-
Committed by 傅剑寒
-
Committed by houj04
* [XPU] update xpu cmake to 0923. test=kunlun
* [XPU] update xpu cmake to 0928. test=kunlun
-
- 28 Sep, 2022 5 commits
-
-
Committed by Chen Weihang
* remove needless using tensor
* remove needless using tensor
* resolve conflict
* replace tensor using
* fix format error
* revert needless changing
* fix rocm and npu compile error
* fix cinn compile error
* fix format error
* fix mkldnn format error
* fix mkldnn format error
* fix cinn compile error
* fix cinn compile error
* fix cinn compile error
* resolve conflict
-
Committed by limingshu
-
Committed by Sławomir Siwek
* Relu6
* remove fluid handler
* add individual kernel signature
* coding style
* replace bounded_relu with clip
* whitespace
* code style
-
Committed by YuanRisheng
* fix concat bug
* fix ci bugs
* fix ci bugs
-
Committed by kangguangli
* add gpu kernel for transfer layout
* comment error throw
* fix: flag setting in testcase; add condition check for raising error
* fix typo
* fix: add error type for PADDLE_THROW
* remove kernel fallback in data_transfer.cc
* remove useless variable definition
-
- 27 Sep, 2022 2 commits
-
-
Committed by Leo Guo
-
Committed by zhangkaihuo
-
- 26 Sep, 2022 2 commits
-
-
Committed by zhaoyingli
-
Committed by Lin Manhui
-
- 23 Sep, 2022 4 commits
-
-
Committed by Zhang Zheng
* Optimize performance of depthwise_conv_fwd
* fix
-
Committed by dongfangshenzhu
* add phi reduce_sum test=kunlun
* add phi reduce_sum test=kunlun
* add phi reduce_sum test=kunlun
-
Committed by YuanRisheng
-
Committed by limingshu
* first commit
* clarify the quotes
* change code style format
* support bfloat16
-
- 22 Sep, 2022 5 commits
-
-
Committed by Paulina Gacek
* Sum kernel migrated to phi
* Static cast added, file name changed
* OneDNNGetDataType to uppercase
* refactoring
* AddOneDNNHandler changed to SumOneDNNHandler
-
Committed by Piotr Paturej
* Convert slice+grad oneDNN fluid kernels to PHI
* Change mutable_data to Alloc
* Refactor licences
-
Committed by Sławomir Siwek
* gaussian random
* mkldnn to onednn renaming
* fix merge conflicts
* remove fluid code
* onednn renaming
* gelu fwd
* sort activations
* gelu gradient
* remove unused macros
* merge conflicts
* fix merge conflicts
* remove extra constraint from gelu op
-
Committed by carryyu
* Optimize topk's performance when k is small and input_width is large
* Modify the blockdim setting logic
* Update top_k_function_cuda.h
-
Committed by limingshu
* first commit
* clarify the quotes
* change code style format
* rerun for ci
-
- 21 Sep, 2022 9 commits
-
-
Committed by ccrrong
* add fp16 support
* update
* update half
* code format
* fix unittest
* fix rocm compile error
* code format
* code format
* fix rocm compile error
* fix rocm compile error
-
Committed by Piotr Paturej
-
Committed by zhangkaihuo
This reverts commit e8de9dfd.
-
Committed by Zhen Wang
* use cinn in the paddle inference
* fix some cmake errors
* Avoid division by zero in the arange_kernel.
* Avoid dynamic ops.
* Remove some useless code.
* Use OpTransInfo to encapsulate some code used in the build_cinn_pass.
-
Committed by zhangkaihuo
* sort out index
-
Committed by zhangkaihuo
* for add_bias
-
Committed by ykkk2333
* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun
* migrate add_n kernel to phi, test=kunlun
-
Committed by 5u13
-
Committed by Piotr Paturej
[PHI] Migrate concat+grad, expand+grad, fill_constant, nearest_interp and bilinear_interp oneDNN kernels (#45863)
* Migrate concat+grad, expand+grad, fill_constant, nearest_interp_v2 and bilinear_interp_v2 oneDNN kernels to PHI
* Remove old namespace variable
* Fix invalid out dims error
* Add mutable_data method to concat output
* Add check for -1 dim before computing out_dims
* Capitalize oneDNNGetDataType function name
* Change fill_constant kernel to correct PHI kernel
* Attempt to fix dims error
* Fix fill_constant (full) kernel
-
- 20 Sep, 2022 5 commits
-
-
Committed by 5u13
-
Committed by Ouyang Chao
* optimize adaptive_pooling_op (forward)
* fix bug of AdaptiveKernelMaxPool2dWithIdx
* fix bug of AdaptiveKernelPool2D
-
Committed by Sławomir Siwek
* init
* remove softmaxop
* merge dev
* correct dir
* style
-
Committed by Piotr Paturej
* Convert split, pad and pad3d kernels
* Convert slice+grad oneDNN fluid kernels to PHI
* change out->mutable_data to dev_ctx.Alloc
-
Committed by Paulina Gacek
* First approach
* Shape kernel corrected
* Compilation error fixed
* Resize corrected
* Registered types added
* Mistake corrected & types added
* sum kernel deleted
-