- 28 Nov, 2022 3 commits
  - Committed by huangjiyi
    * rm fluid “xpu_header.h” deps in phi
    * move part of xpu_op_list.h from fluid to phi
    * add fluid xpu_op_list deps
    * add glog deps for xpu_op_list in phi
    * fix PR-CI-Kunlun
  - Committed by haosicheng
  - Committed by 张春乔
    * Update communicator.cc
    * Update communicator.cc
    * remove LoDTensor
    * remove LoDTensor and Tensor
- 25 Nov, 2022 3 commits
  - Committed by Chitsing KUI
    * attr ready
    * op ip ready
    * start dynamic
    * end2end ok
    * input shape to map, stat by op
    * layer wip
    * first version ready
    * fix proto depds
    * fix profiler deps
    * fix flops typo, rm tuple shape
  - Committed by Nyakku Shigure
  - Committed by houj04
- 24 Nov, 2022 4 commits
  - Committed by zhangyikun02
  - Committed by zhangyikun02
  - Committed by huangjiyi
    * rm dependence to "convert_utils.h" in some files
    * fix bugs
    * replace DataType2String with DataTypeToString
    * replace framework::DataTypeSize with phi::SizeOf
    * mv convert_function from fluid to phi and rm old map
    * recommit with pre-commit
    * repalce ProtoVarType with ProtoDataType and update comment.
    * fix error about include "dnnl.hpp"
    * revert add dep mkldnn to convert_utils in phi
    * add mkldnn deps in convert_utils.h in phi
    * move deps to convert_utils.h in phi
  - Committed by PuQing
- 23 Nov, 2022 3 commits
  - Committed by ykkk2333
    * add stat tool
    * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
    * add masked_selected_grad kernel,test=kunlun
  - Committed by sneaxiy
    * make bfloat16 implicit convert to float/double
    * fix bfloat16_test ut compile
  - Committed by zhangyikun02
- 22 Nov, 2022 1 commit
  - Committed by huangjiyi
    * move "paddle/phi/backends/gpu/gpu_device_function.h" to phi
    * update copyright years
    * rm "fluid/platform/device/gpu/gpu_device_function.h" in phi
    * rm dependence to "gpu_device_function.h" in fluid
    * rm gpu_device_function.h etc in fluid
    * fix rocm-complie bugs
    * fix cuda_helper_test.cu bugs
- 21 Nov, 2022 2 commits
  - Committed by taixiurong
  - Committed by PuQing
- 18 Nov, 2022 4 commits
  - Committed by zyfncg
    * fix bug of zero_allocator in host
    * fix test compile bug
    * add unittest
    * update test
  - Committed by Tian Zheng
    * Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
    * Fix macro
    * Add implementation for conv_kernel and conv_grad_kernel
    * Modification after rebase onto latest develop
    * Modify plan cache to comply with the API of phi::autotune
    * Refactor to reduce duplicate code
    * Review fix:
      - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernelk.cu
      - add const specifier for input tensor
      - add logging when plans fail to execute
      - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
    * - move plan building outside of cache
    * Fix ROCM build
  - Committed by Wang Xin
    * remove "gpu_primitives.h" in fluid namespace
    * fix PR-CI-GpuPS fail
    * fix PR-CI-GpuPS fail
  - Committed by zhangyikun02
- 17 Nov, 2022 3 commits
  - Committed by taixiurong
  - Committed by Wang Xin
  - Committed by sneaxiy
    * add vectorized bfloat16 atomicAdd
    * fix compile error
    * fix compile error again
    * fix V100 compile error
    * fix V100 compile again
- 16 Nov, 2022 1 commit
  - Committed by hong
    * remove avx check
    * fix bug;
- 15 Nov, 2022 1 commit
  - Committed by Sławomir Siwek
    * cleanup unused code
    * unify is_int8 is_bfloat16
    * Simplify matmul_v2 FWD kernel
    * remove RunKernel methods
    * remove import namespace
    * remove headers
    * clean fluid/phi cross imports
    * remove fluid axpy_handler
    * delete fluid methods
    * activations
    * OneDNNMemDesc
    * MKLDNNFormatForSize
    * MatchShapeToLayout
    * MKLDNNMemoryFormat
    * MKLDNNFormat
    * ReorderMKLDNNHandler
    * to_void_cast
    * review suggestions
    * interpolate
    * remove fluid depedency
- 11 Nov, 2022 2 commits
  - Committed by czr-gc
    * feat(ipu): add model_runtime backend support in IPU.
    * fix(ipu_executor): fix error message format.
    * fix(ipu_executor): fix format.
    * fix(ipu_executor): fix format again.
    * fix(ipu_executor): fix format again.
    * fix(ipu_executor): fix format again.
  - Committed by james
    phi::Alloc() complains about missing device_allocator_
- 10 Nov, 2022 1 commit
  - Committed by james
    * XPU support eager mode
    * add unittest for XPU eager mode
    * minor bugfix
    * minor bugfix, test=kunlun
    * correct copyright info
    * 1. remove unsed vars/funcs 2. ProcessGroupBKCL inherit from ProcessGroupStream
    * bugfix for fp16 in eager mode multi-card, test=kunlun
    * rebase & fix a few issues
    * use new processgroup interface, test=kunlun
    * fix compile issue, test=kunlun
- 09 Nov, 2022 1 commit
  - Committed by Jacek Czaja
    * first commit - more fixes - compilation fix - compilation fix - fix - another fix - yet another fix - Fix - fix to fused ops - compilation fix - compilation fix - another compilation fix - another fix - fix - fix - fix - fix - yet another fix - fix - fix - cosmetic fix :- lint - Revert some changes (to be brought back later) - fix to build - Added prototype of slice - fix compilation fix - compilation fix - fix - fix - Fix - fix fix modified: cmake/flags.cmake
    * lint
    * rerun of CI
    * - Fix
    * - lint
    * - lint2
- 08 Nov, 2022 2 commits
  - Committed by zhangyikun02
  - Committed by zhangyikun02
- 07 Nov, 2022 5 commits
  - Committed by Hui Zhang
    * suqeeze2 transpose2 fuse onednn
    * format
    * fix output shape
    * fix conflict
    * format
    * format
    * remove useless
    * remove log
    * simply pass
    * fix comment
    * fix
    * fix msg
    * fix error msg
    * format
  - Committed by QingshuChen
    * test=kunlun
  - Committed by ykkk2333
    add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368)
    * add stat tool
    * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
  - Committed by ronnywang
  - Committed by HongyuJia
    * move cudnn hardcode outside GetExpectedKernelType
    * add header file
    * debug
    * update interpreter_util with hardcode
    * update interpreter_util headerfile
    * solve activation hardcode
    * debug with CI
    * add mkldnn_op_list header file
    * temporarily uncomment mkldnn
    * temporarily uncomment mkldnn
    * delete sequence_softmax cudnn hardcode
    * add hardcode to data_transfer.cc
    * update data_transfer headerfile
    * try fix segment fault
    * update cudnn&miopen_helper
    * reset HasAttr of DygraphExctnCtx
    * debug, this commit should pass all CI
    * debug should pass CI, temporarily disable activation
    * debug should pass CI
    * fix default_attr=nullptr bug
    * clean debug code
    * Call SetDnnFallback function in the base class
    * activation fallback to plain kernel
    * fix default GetExpectedKernelType find wrong kernel
    * search cudnn kernel instead of fallback
    * fix cudnn_handle bug
    * remove tanh use_cudnn
    * restore tanh use_cudnn
    * debug tanh
    * fix tanh bug
    * delete activation cudnn kernel
    * polish code
- 05 Nov, 2022 1 commit
  - Committed by Yiqun Liu
- 04 Nov, 2022 3 commits
  - Committed by houj04
    * [XPU] add cumsum op. test=kunlun
    * try to fix linker. test=kunlun
    * try to fix linker. test=kunlun
    * try to fix linker. test=kunlun
    * debug. test=kunlun
    * update xpu.cmake. remove unnecessary codes. test=kunlun.
  - Committed by ykkk2333
  - Committed by jakpiase
    * tmp save
    * minor chnage
    * CI fix
    * added FC optimizations
    * latest update
    * CI fix
    * fixed bug with fusing fc