- 03 Feb, 2023 1 commit
-
-
Committed by Sławomir Siwek
* replace matmul with matmul_v2 in fuse passes
* Remove fusion logic from matmul
* removing fusion methods
* add proper name
* adjust namespaces
* clean attrs in python tests
* delete checkpoint and restore matmul version
* remove unused code
* matmul and reshape/transpose fuses migrated
* split MatmulOneDNN headers
* fuse activation and eltwise_add
* add fuse_activation
* matmul_transpose_reshape/reshape_transpose_matmul
* matmul + elementwise_add (fused)
* activation temporary modification
* merge newest develop
* remove dependency from other PR
* revert pbtxt
* remove placeholders from matmul_v2
* add description in OPMaker
* remove matmul_v2_op.h and all dependencies
* remove dims changing in base op
* add possibility to fuse already fused_matmul
* restart broken CI
* Empty-Commit
* revert matmul_utils.h
* codestyle
* adjust imports
* add pbtxt file
* 100% matmul unit tests coverage
* trigger CI with minimal changes to develop
* adjust changes to develop
* add fused_matmul op
* inherit base ops
* add "v2"
* move OPMaker
* Gradually add fused_matmul files
* second batch of fused_matmul changes
* split infershapes of matmul_v2 and fused_matmul
* inherit fused_matmul from matmul_v2
* Update paddle/phi/backends/onednn/onednn_reuse.h
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
- 02 Feb, 2023 1 commit
-
-
Committed by ronnywang
-
- 01 Feb, 2023 1 commit
-
-
Committed by zhangyikun02
-
- 31 Jan, 2023 2 commits
-
-
Committed by wangshengxiang
-
Committed by ronnywang
* [CustomDevice] add custom device api
* update
* update
* test=document_fix
* update
* update
* add examples
-
- 30 Jan, 2023 1 commit
-
-
Committed by Ruibiao Chen
* Support stream priority for standalone executor
* Fix compile error
* Fix compile error
* Fix compile error
* Fix compile error
* Fix compile error
-
- 19 Jan, 2023 1 commit
-
-
Committed by jameszhang
* [KUNLUN] add op: maxpool_with_index
* use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
* fix file format
* solve clip unittest failure
* minor fix
* Revert "solve clip unittest failure" since the issue is fixed in #49535
  This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
* align with xdnn on the definition of mask in max_pool_with_index
* minor
-
- 18 Jan, 2023 4 commits
-
-
Committed by Sławomir Siwek
* extract fuse pass logic to header file
* adjust namespaces
* Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h
  update date
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* add inline remove static
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by RuohengMa
* add reduce_sum_int64 and reduce_sum_int8 xpu kernels
* [PHI] add clip grad kernel with support type float32 and int32
* [PHI unittest] add clip_grad unit test
* adapt code to clang-format
* update xpu api output with clip_grad api
* remove int8 support of reduce_sum xpu kernel since it cannot pass unit tests
* adapt license date, add code for XPUDataType conversion
* add int8 support of reduce_sum
* add reduce_sum unit tests for dtype int64, int8, and add more test cases
* update license date
* remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel
* change license date
-
Committed by houj04
-
Committed by jameszhang
* revert to use default XPU stream for computing
  XPUContext now has a null stream by default. If you want to use a separate stream (e.g. in async collective communication), you should create a dedicated XPUContext and invoke its XPUContext::CreateStream()
* minor
-
- 16 Jan, 2023 1 commit
-
-
Committed by QingshuChen
-
- 13 Jan, 2023 5 commits
-
-
Committed by duanyanhui
* clear ProcessGroupCustom manually
* fix bug
* fix bug
* move destroy ProcessGroup to ProcessGroupIdMap
* enable destroy to all device
* remove unused comments
* change to internal api
* Update process_group.cc
* Update process_group.cc
-
Committed by jameszhang
* kunlun add support for c_concat and c_split
* replace mutable_data() and ShareDataWith()
-
Committed by ykkk2333
-
Committed by jameszhang
* fix xpu unittest issue: zero_dim_tensor
* deal with leftover issue introduced by #49470
-
Committed by wangshengxiang
-
- 12 Jan, 2023 3 commits
-
-
Committed by YuanRisheng
-
Committed by Leo Guo
xpu2_op_list.cc. test=kunlun
-
Committed by YuanRisheng
* rename kernel
* delete sig
* modify code according to comment
* fix ci bugs
-
- 10 Jan, 2023 2 commits
- 09 Jan, 2023 2 commits
-
-
Committed by QingshuChen
-
Committed by ykkk2333
* migrate shaple sgd, split, sign xpu kernels to phi, test=kunlun
* fix dlrm throughput problem, test=kunlun
* add xpu einsum, fill_diagonal, and diagonal kernels, test=kunlun
-
- 06 Jan, 2023 3 commits
- 03 Jan, 2023 1 commit
-
-
Committed by limingshu
-
- 27 Dec, 2022 1 commit
-
-
Committed by zhangyikun02
-
- 26 Dec, 2022 1 commit
-
-
Committed by ykkk2333
* migrate shaple sgd, split, sign xpu kernels to phi, test=kunlun
* fix dlrm throughput problem, test=kunlun
-
- 23 Dec, 2022 2 commits
-
-
Committed by haosicheng
-
Committed by Hui Zhang
* add warp transducer code
-
- 22 Dec, 2022 1 commit
-
-
Committed by QingshuChen
-
- 20 Dec, 2022 1 commit
-
-
Committed by huangjiyi
* move dropout_impl from fluid to phi
* move cuda_graph_with_memory_pool from fluid to phi
* update namespace
* remove cuda_graph in fluid
* fix mac-build
* fix bugs
* correct CodeStyle
* fix mac-build
* fix mutable_data
* fix stl include
* fix copy param
-
- 19 Dec, 2022 2 commits
-
-
Committed by Wen Sun
-
Committed by zhangyikun02
-
- 17 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 16 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 15 Dec, 2022 1 commit
-
-
Committed by huangjiyi
-
- 14 Dec, 2022 1 commit
-
-
Committed by james
* nullptr bugfix for XPU pg mode
  Also a few kernels are added to xpu whitelist
* increase error msg length
-