- 17 Feb, 2023 3 commits
-
-
Committed by Sławomir Siwek
* change SHA * update to oneDNN 2.7 * update to 2.7.1 * update to 2.7.2 * add supported hardsigmoid * update to 2.7.3 * limit cpu threads for int8 test * group activations
-
Committed by houj04
* [XPU] add fp16 support for cumsum and log. * [XPU] add fp16 support for cumsum and log.
-
Committed by zhupengyang
[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570)
-
- 16 Feb, 2023 4 commits
-
-
Committed by shentanyue
* support xpu multi-card infer * add ut * clean code * clean code * fix * fix * fix * fix
-
Committed by houj04
* [XPU] update xccl to 1.0.8 * update xdnn. add uint8 for concat and split. * update xdnn to 20230215.
-
Committed by ronnywang
* [XPU] add group_norm kernel * update * add xpu sin, cos, randint, linspace kernels * update * update
-
Committed by zhupengyang
-
- 15 Feb, 2023 4 commits
-
-
Committed by YuanRisheng
* move profiler * add file * fix mac compile bugs * fix ci bugs * fix mac bugs * fix ci bugs * fix compile bugs * perfect code according comment
-
Committed by zhangyikun02
-
Committed by QingshuChen
-
Committed by YuhangLi
* [CUSTOM]custom device add black_list * change log level * fix some issues
-
- 14 Feb, 2023 1 commit
-
-
Committed by engineer1109
fix X remove TensorCopy codestyle add fluid memory header fix symbol fix cmake fix cmake fix context fix header fix place fix context fix context fix context fix code fix custom context fix custom context fix copy fix data_transform fix style remove changes of custom fix scalar
-
- 13 Feb, 2023 1 commit
-
-
Committed by ykkk2333
* add xpu adagrad and where_grad kernels, test=kunlun * add xpu pool3d kernels, test=kunlun
-
- 10 Feb, 2023 4 commits
-
-
Committed by Leo Guo
d_bias are nullptr. Modify the code style of full_kernel.cc. Add new data type for concat, elementwise_add, gather, scale, scatter ops. test=kunlun
-
Committed by zhupengyang
-
Committed by Huang Jiyi
* rm gradient_accumulator in phi * update
-
Committed by wangshengxiang
-
- 09 Feb, 2023 2 commits
-
-
Committed by Leo Guo
-
Committed by zhangyikun02
-
- 06 Feb, 2023 1 commit
-
-
Committed by risemeup1
-
- 03 Feb, 2023 1 commit
-
-
Committed by Sławomir Siwek
* replace matmul with matmul_v2 in fuse passes * Remove fusion logic from matmul * removing fusion methods * add proper name * adjust namespaces * clean attrs in python tests * delete checkpoint and restore matmul version * remove unused code * matmul and reshape/transpose fuses migrated * split MatmulOneDNN headers * fuse activation and eltwise_add * add fuse_activation * matmul_transpose_reshape/reshape_transpose_matmul * matmul + elementwise_add (fused) * activation temporary modification * merge newest develop * remove dependency from other PR * revert pbtxt * remove placeholders from matmul_v2 * add description in OPMaker * remove matmul_v2_op.h and all dependencies * remove dims changing in base op * add possibility to fuse already fused_matmul * restart broken CI * Empty-Commit * revert matmul_utils.h * codestyle * adjust imports * add pbtxt file * 100% matmul unit tests coverage * trigger CI with minimal changes to develop * adjust changes to develop * add fused_matmul op * inherit base ops * add "v2" * move OPMaker * Gradually add fused_matmul files * second batch of fused_matmul changes * split infershapes of matmul_v2 and fused_matmul * inherit fused_matmul from matmul_v2 * Update paddle/phi/backends/onednn/onednn_reuse.h Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> * Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> --------- Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
- 02 Feb, 2023 1 commit
-
-
Committed by ronnywang
-
- 01 Feb, 2023 1 commit
-
-
Committed by zhangyikun02
-
- 31 Jan, 2023 2 commits
-
-
Committed by wangshengxiang
-
Committed by ronnywang
* [CustomDevice] add custom device api * update * update * test=document_fix * update * update * add examples
-
- 30 Jan, 2023 1 commit
-
-
Committed by Ruibiao Chen
* Support stream priority for standalone executor * Fix compile error * Fix compile error * Fix compile error * Fix compile error * Fix compile error
-
- 19 Jan, 2023 1 commit
-
-
Committed by jameszhang
* [KUNLUN] add op: maxpool_with_index * use DeviceContext::Alloc() instead of DenseTensor::mutable_data() * fix file format * solve clip unittest failure * minor fix * Revert "solve clip unittest failure" since the issue is fixed in #49535. This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b. * align with xdnn on the definition of mask in max_pool_with_index * minor
-
- 18 Jan, 2023 4 commits
-
-
Committed by Sławomir Siwek
* extract fuse pass logic to header file * adjust namespaces * Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h update date Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> * add inline remove static Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by RuohengMa
* add reduce_sum_int64 and reduce_sum_int8 xpu kernels * [PHI] add clip grad kernel with support type float32 and int32 * [PHI unittest] add clip_grad unit test * adapt code to clang-format * update xpu api output with clip_grad api * remove int8 support of reduce_sum xpu kernel since it cannot pass unit tests * adapt license date, add code for XPUDataType conversion * add int8 support of reduce_sum * add reduce_sum unit tests for dtype int64, int8, and add more test cases * update license date * remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel * change license date
-
Committed by houj04
-
Committed by jameszhang
* revert to use default XPU stream for computing. XPUContext now has a null stream by default. If you want to use a separate stream (e.g. in async collective communication), you should create a dedicated XPUContext and invoke its XPUContext::CreateStream(). * minor
-
- 16 Jan, 2023 1 commit
-
-
Committed by QingshuChen
-
- 13 Jan, 2023 5 commits
-
-
Committed by duanyanhui
* clear ProcessGroupCustom manually * fix bug * fix bug * move destroy ProcessGroup to ProcessGroupIdMap * enable destroy to all device * remove unused comments * change to internal api * Update process_group.cc * Update process_group.cc
-
Committed by jameszhang
* kunlun add support for c_concat and c_split * replace mutable_data() and ShareDataWith()
-
Committed by ykkk2333
-
Committed by jameszhang
* fix xpu unittest issue: zero_dim_tensor * deal with leftout issue introduced by #49470
-
Committed by wangshengxiang
-
- 12 Jan, 2023 3 commits
-
-
Committed by YuanRisheng
-
Committed by Leo Guo
xpu2_op_list.cc. test=kunlun
-
Committed by YuanRisheng
* rename kernel * delete sig * modify code according comment * fix ci bugs
-