- 13 Apr, 2023 2 commits
-
-
Committed by HongyuJia
* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h * Add logging.h for profiler.cc * Add logging.h for gloo_utils.h * Add logging.h for addmm_kernel_impl.h * Add logging.h for addmm_grad_kernel_impl.h * Add logging.h for p_send_kernel.cu * Add logging.h for determinant_grad_kernel_impl.h * Add logging.h for p_recv_kernel.cu * Add logging.h for elementwise_grad_base.h * Add logging.h for transfer_layout_kernel.cc * Add logging.h for eigvals_kernel.cc and index_select_impl.h * Add logging.h for all files in kernel directory * Add logging.h for xpu_info.cc * Add logging.h for xpu
-
Committed by zhangyuqin1998
-
- 12 Apr, 2023 3 commits
-
-
Committed by Zhang Zheng
* Optimize performance of unique kernel * fix ci
-
Committed by Wei Shengyu
* add bf16 support and bf16/fp16 unittest for pool2d * add include files * dbg * reformat * reformat * modify code according to review comment * remove duplicate code * remove dup code * remove useless include * dbg
-
Committed by Guoxia Wang
* [AMP OP&Test] support bf16 for batchnorm * codestyle * Update batch_norm_grad_kernel.cu * Update batch_norm_kernel.cu * fix codestyle * fix * fix * fix * fix * fix * Update batch_norm_kernel.cc
-
- 11 Apr, 2023 3 commits
-
-
Committed by WJJ1995
* add bfp16 test for isfinite * fixed for ci * deal with comments * fixed test * skip test in cpu * deal with comments * fixed for ci * fixed testcase * fixed for ci * fixed for testcase
-
Committed by LinearTemporalLogic
* Add output defs for eigh kernel * fix * update * update * fix * fix
-
Committed by Thomas Young
-
- 10 Apr, 2023 8 commits
-
-
Committed by Difer
* add_fp_bf_for_flip_gaussian_random * forget convert uint * fix some error * fix some error
-
Committed by cyberslack_lee
-
Committed by HongyuJia
* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc * Add gflags.h for other files * Add gflags.h for other files * Add gflags.h for blas_impl.hip.h * Add gflags.h for miopen_helper.h
-
Committed by Vvsmile
* adjust defalut tolerance of output and grad * fix a bug in the grad of OpTest * fix the type of setting defalut value in optest, both forward and backward * add defalut * fix test_sum_op * adjust tolerance * fix the tolerance of eager * add bf16 and fp16 to the activation tests * remove some fixs * fix activation * fix fp16 * fix gelu * fix the activation tests * add bfloat16 specialization to singrad and cosgrad * fix bugs * fix bugs * add unittest * add skip * add fp/bf to rrelu/rrelu_grad * git add rrelu * fix bugs
-
Committed by qizhaoaoe
* add fp16 and bf16 support for instance_norm * fix /= operator which not support bf16 * fix instance_norm_grad kernel and unittests. * fix fp32 unittests. * fix instance_norm_kernel and unittests. * fix instance_norm_grad_kernel and unittest threshold. * add fp16/bf16 for instance_norm_grad_grad op. * add bf16 dtype check. * fix conflicts. * fix cpu support for fp32 op and fix type in instance_norm_grad_kernel. * fix type in instance_norm_kernel. * fix bf16 outputs in unittests and refine codes. * fix dx computation. * delete unuseful params and head including. * add fp16/bf16 for static graph. * fix device condiction for instance_norm op. * fix instance_norm_grad_grad and bf16 op tests. * fix op_test to support grad of bf16 can be compared with fp32. * remove updates. * add self-defined grad.
-
Committed by Zero Rains
* fix divide zero bug for softmax_with_cross_entropy * change the single test way * can run but slow. the most important is that I do not know why it slow * remove some useless commet * change the copyright to correct * remove some useless change * if repeat_times == 1, we will not use BroadcastKernel
-
Committed by cyberslack_lee
-
Committed by Asthestarsfalll
* Optimize the performance of logsumexp * Support zero-dim tensor
-
- 09 Apr, 2023 1 commit
-
-
Committed by shaojie_wang
-
- 07 Apr, 2023 1 commit
-
-
Committed by TaoTao Li
fix merge conflicts
-
- 06 Apr, 2023 4 commits
-
-
Committed by zhangyuqin1998
* Rename conv2d transpose grad grad * fix
-
Committed by Chitsing KUI
-
Committed by sneaxiy
* fix flash attn * fix another API
-
Committed by LoneRanger
* add fp16 and bf16 for eye and frame * fix bug * fix bug * fix bug * Update test_frame_op.py fix code style * fix bug * fix bug
-
- 04 Apr, 2023 3 commits
-
-
Committed by chenxujun
* Add pool3d lgamma masked_select tests * Fix code
-
Committed by Ruibiao Chen
* Improve new executor static build * Skip GC for static build * Skip infershape for static build * Handle read_op * Add fused_attention to OpsWithFluidKernelNeedMoveToPhi * Fix argsort typos * Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi * Fix skip share lod errors * Fix errors for adam * Fix errors for eigvals, memcpy and fake_quantize * Add static_build.cc * Add black list * Fix CI errors * Fix CI errors * Fix CI errors * Fix TensorArray * Fix TensorArray * Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel * Fix copy * Fix errors * Fix momentum * Skip mkldnn * Fix CI errors * Fix c_sync_calc_stream_op * Fix CINN * Fix while op * All CI pass, disable FLAGS to merge code, enable it after more tests in future * Add UTs * Fix typos * Fix typos * Add mkldnn UT * Remove mkldnn test * Fix typos * Fix dist test * Fix typos * Fix CI errors * Fix CI errors * Add UTs * Fix typos * Fix typos * Add sparse tests * ToComplexType -> ToComplex * Add test_matmul_op_static_build to disable_win_inference_test
-
Committed by zhangyuqin1998
* rename_bilinear_tensor_product * fix
-
- 03 Apr, 2023 4 commits
-
-
Committed by denglianbin
* finish task * fix error * pre-commit fix code style * add unittest. * change unittest. * delete unittest case.
-
Committed by chenxujun
-
Committed by LoneRanger
【PaddlePaddle Hackathon 4】No.56 : add fp16 test and bf16 test for diag, diagonal, fill and fill_diagonal_tensor (#51649)
-
Committed by zhangyuqin1998
-
- 31 Mar, 2023 2 commits
-
-
Committed by zhangyuqin1998
-
Committed by YuanRisheng
* remove distribute * fix py3 bugs * fix gpu-ps bugs * fix compile bugs * fix unittest bugs
-
- 30 Mar, 2023 3 commits
-
-
Committed by Roc
-
Committed by yunyaoXYY
* add FP16 for multinomial * fix input data * update code * fix FP16 * fix code
-
Committed by Wang Xinyu
* stride slice fp16 and bf16 unitest * fix code style * add self.dtype
-
- 29 Mar, 2023 2 commits
- 28 Mar, 2023 3 commits
-
-
Committed by houj04
* fix int8 support for full kernel * fix ut.
-
Committed by Haohongxiang
-
Committed by wangxinxin08
* add unittest for conv2d/depthwise_conv2d/conv2d_transpose * add bf16 for DWConv and ConvTranspose * fix unitest of conv2d_transpose * modify DWConv2d op and unittest * fix unittest of conv2d_transpose_bf16 * modify unittest name according to review * modify atol of DWConv2D unittest
-
- 27 Mar, 2023 1 commit
-
-
Committed by Leo Chen
* unbind support bool dtype * replace np.array_equal
-