- 11 Oct, 2022 1 commit
-
-
Committed by niuliling123
-
- 10 Oct, 2022 5 commits
-
-
Committed by YuanRisheng
* add yaml entry for rnn and rnn_grad, move infershape function for rnn_grad to phi infer meta
* WIP: move rnn kernel to phi
* Change the code generation to avoid converting from initializer list to tuple of heterogeneous types. This is only triggered when an api has intermediate outputs, and the outputs are of heterogeneous types.
* fix the bug that when none in a vector of tensors requires gradient, the conversion from InferShapeContext to InferMetaContext (a.k.a. BuildInferMetaContext) produces erroneous results.
* fix ci bugs
* fix ci bugs
* fix ci bugs
* modify code according to comments

Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
-
Committed by Rayman
-
Committed by Paulina Gacek
* op migrated, Copy(OneDNNContext, ...) added
* mutable_data & op registration in fluid removed
* refactoring
* OneDNNGetDataType to uppercase
* missing cpu check added, handler moved to .h file
* name changed to transpose_grad
* Copy changed back to TensorCopy
* Resizing corrected, Copy(OneDNNContext) removed
-
Committed by Rayman
-
Committed by Rayman
support fp16 for deformable conv
-
- 09 Oct, 2022 4 commits
-
-
Committed by zhangkaihuo
-
Committed by zhangkaihuo
-
Committed by Sławomir Siwek
-
Committed by Sławomir Siwek
* enable hard_swish_grad unit test
* remove unused argument
-
- 08 Oct, 2022 1 commit
-
-
Committed by HongyuJia
-
- 03 Oct, 2022 1 commit
-
-
Committed by Jacek Czaja
* some more MD changes
* lint
* compilation fixes
* compilation fixes
* lint
* fix
-
- 30 Sep, 2022 10 commits
-
-
Committed by engineer1109
* Fix undefined reference PD_IntArrayGetElementCount
* Delete unused PD_IntArrayGetSize
-
Committed by Zhang Zheng
* Optimize performance of depthwise_conv_bwd of filter
* op-benchmark
* fix
* op benchmark
* merge bwd
-
Committed by Zhang Zheng
* Optimize performance of depthwise_conv_bwd
* fix
-
Committed by ykkk2333
* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun
* migrate add_n kernel to phi, test=kunlun
* fix bugs of tipc, test=kunlun
-
Committed by HongyuJia
-
Committed by HongyuJia
-
Committed by 六个骨头
-
Committed by HongyuJia
-
Committed by HongyuJia
-
Committed by sneaxiy
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* add bfloat16 to selu_grad to pass CI
* fix selu grad compilation error
-
- 29 Sep, 2022 9 commits
-
-
Committed by carryyu
-
Committed by Zhang Zheng
* Move valid check from python to kernel
* fix error throw
* fix
* invalid label check
* fix
* Revert "fix" This reverts commit 79fad6799cfa4b30423dbc84e67d7d843d22b84a.
* Revert "invalid label check" This reverts commit 402a9707390ad5386b3222e85844b92d2e9b9fa4.
* Revert "fix" This reverts commit 09ba3080ee0587447f875c19cdf060485f15ae3b.
* Revert "fix error throw" This reverts commit a901bfcc2179d5c120ec29af766f392b122dab52.
* Revert "Move valid check from python to kernel" This reverts commit baa03cc4ef82d8d45516c30dfb52bf5aead30748.
* final fix
* fix
* fix
-
Committed by Leo Guo
Add index_select, index_select_grad, reduce_min kernels and their unit tests for kunlun. Register index_select, index_select_grad, reduce_min, sqrt, sqrt_grad in xpu2_op_list. test=kunlun. (#46557)
-
Committed by carryyu
* fix P40 topk: Make the optimized topk compatible with P40.
* fix P40 topk: Make the optimized topk compatible with P40.
* fix P40 topk: Make the optimized topk compatible with P40.
-
Committed by ming1753
-
Committed by 傅剑寒
-
Committed by HongyuJia
* select highest priority layout
* opt performance, save virtual table find
-
Committed by HongyuJia
* add datatype check for ParseKernelKeyByInputArgs
* polish error message
* Actually, einsum has vector<Tensor> input with DataType::COMPLEX64, see test_einsum_v2.py
* headerfile remove enforce.h
-
Committed by houj04
* [XPU] update xpu cmake to 0923. test=kunlun
* [XPU] update xpu cmake to 0928. test=kunlun
-
- 28 Sep, 2022 9 commits
-
-
Committed by Chen Weihang
* remove needless using tensor
* remove needless using tensor
* resolve conflict
* replace tensor using
* fix format error
* revert needless changing
* fix rocm and npu compile error
* fix cinn compile error
* fix format error
* fix mkldnn format error
* fix mkldnn format error
* fix cinn compile error
* fix cinn compile error
* fix cinn compile error
* resolve conflict
-
Committed by HongyuJia
-
Committed by HongyuJia
* change BackendSet from 64 bits to 32 bits
* fix _MSC_VER error, __lzcnt32->__lzcnt
* fix __GNUC__ error, __builtin_clzl->__builtin_clz
-
Committed by limingshu
-
Committed by Sławomir Siwek
* Relu6
* remove fluid handler
* add individual kernel signature
* coding style
* replace bounded_relu with clip
* whitespace
* code style
-
Committed by YuanRisheng
-
Committed by YuanRisheng
* fix concat bug
* fix ci bugs
* fix ci bugs
-
Committed by kangguangli
* add gpu kernel for transfer layout
* comment error throw
* fix: flag setting in testcase; add condition check for raising error
* fix typo
* fix: add error type for PADDLE_THROW
* remove kernel fallback in data_transfer.cc
* remove useless variable definition
-
Committed by wanghuancoder
* phi support xpu black list
-