- 29 Nov, 2022 1 commit
-
-
Committed by Sławomir Siwek
-
- 28 Nov, 2022 4 commits
-
-
Committed by Wang Bojun
* add trt support
-
Committed by huangjiyi
* decouple cudnn_desc.h from fluid * move cudnn_desc.h from fluid to phi * fix bugs * decouple cudnn_helper.h from fluid * fix bugs * move cudnn_helper.h from fluid to phi * add fluid cudnn_helper.h * move miopen_desc.h from fluid to phi * move miopen_helper.h from fluid to phi * fix bugs * move gpu_dnn.h from fluid to phi * fix bugs * update copyright year * simplify gpu_dnn.h in fluid * fix bugs * fix xpu build bug * fix compile bug * fix bug
-
Committed by 张春乔
-
Committed by MarDino
-
- 23 Nov, 2022 1 commit
-
-
Committed by MarDino
* use fused mlp in multi transformer * Restruct code * use cublaslt to fuse ffn * fix conflict
-
- 22 Nov, 2022 2 commits
-
-
Committed by Tian Zheng
* Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi * update copyright years * rm "fluid/platform/device/gpu/gpu_device_function.h" in phi * rm dependence to "gpu_device_function.h" in fluid * rm gpu_device_function.h etc in fluid * fix rocm-complie bugs * fix cuda_helper_test.cu bugs
-
- 21 Nov, 2022 1 commit
-
-
Committed by lzy
* use mma for QK dot computing in fused_multi_transformer. * Update fused_multi_transformer_op.cu.h
-
- 18 Nov, 2022 3 commits
-
-
Committed by MarDino
* fused qkvBiasAdd and transpose with split qkv * fix typo * fix format * fix name * add annotation * fix comment
-
Committed by MarDino
* Add quick gelu and fused bias add kernel * fix annotation * remove useless code * add fast gelu option and set it in multi transformer op * add flag to restrict if use fast gelu approximate * fix flags conflict * fix use tanh function instead * add cudart version limit * use phi fast tanh func * fix comment
-
Committed by Wang Xin
* remove "gpu_primitives.h" in fluid namespace * fix PR-CI-GpuPS fail * fix PR-CI-GpuPS fail
-
- 17 Nov, 2022 2 commits
-
-
Committed by YuanRisheng
* standard api * fix xpu bugs
-
Committed by taixiurong
-
- 15 Nov, 2022 1 commit
-
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid depedency
-
- 09 Nov, 2022 2 commits
-
-
Committed by HongyuJia
-
Committed by Jacek Czaja
* first commit - more fixes - compilation fix - compilation fix - fix - another fix - yet another fix - Fix - fix to fused ops - compilation fix - compilation fix - another compilation fix - another fix - fix - fix - fix - fix - yet another fix - fix - fix - cosmetic fix :- lint - Revert some changes (to be brought back later) - fix to build - Added prototype of slice - fix compilation fix - compilation fix - fix - fix - Fix - fix fix modified: cmake/flags.cmake * lint * rerun of CI * - Fix * - lint * - lint2
-
- 07 Nov, 2022 1 commit
-
-
Committed by Wen Sun
-
- 01 Nov, 2022 1 commit
-
-
Committed by Chen Weihang
* add extra attr property set * add type_info for all context * add onednn context to all context * fix context compile error * simplify conv kernel args * pass runtime attr into dev_ctx * fix marco error * clear conv_grad_kernel extra args * merge conv_grad_grad into conv_grad * clear conv2d_grad_grad extra attrs * clear yaml and eager extra attr * fix conv1d error * change to thread local * fix npu compile failed * try to fix windows compile failed * add conv2d onednn phi kernel * fix ci bugs (#36) * fix compile bugs (#38) * fix extra input transform bug (#39) * support dynamic created attr (#40) * reset extra info gen code * rm conv_grad_grad kernel * reimpl pass attr adapting * add int attr support * remove vector inputnames creating * fix map at error * Update paddle/phi/kernels/onednn/conv_grad_kernel.cc Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com> * remove useless extra attrs * replace mkldnn_engine by onednn_engine Co-authored-by: YuanRisheng <yuanrisheng@baidu.com> Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
- 31 Oct, 2022 2 commits
-
-
Committed by feng_shuai
* optimize: vit 384 * fix:bug * fix:bug * fix:supoort rocm complie * refactor:name * fix:support rocm * fix:__HIP_NO_HALF_CONVERSIONS__ * optimize: delete scalar * fix:rocm can't support * fix:ernie error
-
Committed by Nyakku Shigure
* fix typo `Fasle`/`Flase` -> `False` * fix typo `Ture` -> `True`
-
- 27 Oct, 2022 1 commit
-
-
Committed by Shijie
-
- 26 Oct, 2022 2 commits
- 25 Oct, 2022 1 commit
-
-
Committed by HongyuJia
-
- 24 Oct, 2022 1 commit
-
-
Committed by Yiqun Liu
-
- 17 Oct, 2022 1 commit
-
-
Committed by YuanRisheng
* namespace modify * update by comment
-
- 13 Oct, 2022 1 commit
-
-
Committed by HongyuJia
* remove PADDLE_WITH_MKLDNN, test white_list=abs * fix unique_ptr * fix op.Type() * remove TODO in kernel_dispatch.h * remove IndicateVarDataType function, update white_list * remove mkldnn hard code * add comments * fix == * update mkldnn_op_list * delete hard code of OPs * update mkldnn_op_list * update mkldnn_op_list, remove interp * add error check for ExecutionContext * update mkldnn_op_list, remove transpose2_grad * remove interpolate mkldnn * remove fill_constant mkldnn * opt HasAttr in DygraphExecutionContext * deprecated commit, test mkldnn_white_list * deprecated commit, test mkldnn_white_list * deprecated commit, test mkldnn_black_list * update mkldnn_op_list, add assert error op * solve cudnn related op * fix error * add mkldnn fallback in phi_utils.cc * remove mkldnn fallback in phi_utils.cc * opt code implementation * polish Copyright License
-
- 11 Oct, 2022 1 commit
-
-
Committed by Chen Weihang
* remove using lodtensor part1 * polish history code format
-
- 10 Oct, 2022 2 commits
- 09 Oct, 2022 1 commit
-
-
Committed by Haohongxiang
-
- 30 Sep, 2022 1 commit
-
-
Committed by sneaxiy
* support pure bfloat16 * support bf16 linear * update PR to pass CI * tiny fix where_grad_kernel.cu * add bfloat16 to selu_grad to pass CI * fix selu grad compilation error
-
- 28 Sep, 2022 1 commit
-
-
Committed by Chen Weihang
* remove needless using tensor * remove needless using tensor * resolve conflict * replace tensor using * fix format error * revert needless changing * fix rocm and npu compile error * fix cinn compile error * fix format error * fix mkldnn format error * fix mkldnn format error * fix cinn compile error * fix cinn compile error * fix cinn compile error * resolve conflict
-
- 21 Sep, 2022 1 commit
-
-
Committed by jiahongyu
-
- 18 Sep, 2022 1 commit
-
-
Committed by RichardWooSJTU
-
- 15 Sep, 2022 1 commit
-
-
Committed by Nyakku Shigure
-
- 09 Sep, 2022 2 commits
-
-
Committed by xiaoxiaohehe001
-
Committed by sneaxiy
-
- 08 Sep, 2022 1 commit
-
-
Committed by taixiurong
* add gemm_epilogue * xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
-