- 18 Nov, 2022: 8 commits
-
-
Committed by Tian Zheng
* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
* Fix macro
* Add implementation for conv_kernel and conv_grad_kernel
* Modification after rebase onto latest develop
* Modify plan cache to comply with the API of phi::autotune
* Refactor to reduce duplicate code
* Review fix:
  - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
  - add const specifier for input tensor
  - add logging when plans fail to execute
  - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
  - move plan building outside of cache
* Fix ROCM build
-
Committed by Yuang Liu
-
Committed by Wang Xin
* remove "gpu_primitives.h" in fluid namespace
* fix PR-CI-GpuPS fail
* fix PR-CI-GpuPS fail
-
Committed by zhangyikun02
-
Committed by feng_shuai
-
Committed by feng_shuai
-
Committed by Sylwester Fraczek
-
Committed by huangjiyi
-
- 17 Nov, 2022: 18 commits
-
-
Committed by zyfncg
* clip extra and intermediate output of op
* fix bug
* fix bug
* polish code
* polish log
-
Committed by Qi Li
* [NPU] add _npu_identity op and api, test=develop
* fix doc
* address comments
-
Committed by Wen Sun
-
Committed by wenbin
* int scale
* round
* revert commit
-
Committed by xiongkun
-
Committed by hong
-
Committed by huangjiyi
-
Committed by YuanRisheng
* standard api
* fix xpu bugs
-
Committed by Mountagha
-
Committed by taixiurong
-
Committed by Wang Xin
-
Committed by xiaoxiaohehe001
* add_cast_bool
* cast
-
Committed by Yiqun Liu
* Implement a common dims simplifier.
* Fix the include position error.
* Reduce the CPU overhead of broadcast computing.
-
Committed by huangjiyi
-
Committed by huangjiyi
* rm "paddle/fluid/operators/math.h" in phi
* rm "paddle/fluid/operators/math.h" in fluid
-
Committed by Yuang Liu
Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16 training with tensor fusion. (#48041)
* add bfloat16 for adamw
* set lr not to bfloat16 for pure bf16 training
* update the logic
* update the adamw optimizer
* support bfloat16 for adam
-
Committed by sneaxiy
* add vectorized bfloat16 atomicAdd
* fix compile error
* fix compile error again
* fix V100 compile error
* fix V100 compile again
-
Committed by zyfncg
-
- 16 Nov, 2022: 14 commits
-
-
Committed by huangjiyi
-
Committed by Qi Li
* [NPU] update npu prop, test=develop
* remove ddim.h
* remove diff
* update storage prop, test=develop
-
Committed by xiaoxiaohehe001
* add_fill_any_like
* add_fill_any_like
-
Committed by wenbin
* elementwise_op
* add teller
* modify ut
* comments
* modify ut
* return
* modify
-
Committed by Zhang Jun
-
Committed by Zhang Jun
-
Committed by HongyuJia
* simplify depthwise_conv2d phi kernel selection
* fix depthwise_conv2d
-
Committed by Piotr Paturej
* Enable bf16 in oneDNN bilinear_interp kernel
* Fix bilinear_interp_v2 not enabled in models
* Remove unnecessary checks
-
Committed by ykkk2333
* add stat tool
* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
* embedding and embedding_grad add int32 input, test=kunlun
-
Committed by hong
* remove avx check
* fix bug
-
Committed by Leo Chen
-
Committed by Wang Xin
-
Committed by Wen Sun
* refactor: update pg custom
* fix: use new api in ut
* fix: typo
* revert: recover legacy apis
* fix: add GetDeviceContext
-
Committed by czr-gc
-