- 10 Apr, 2023 3 commits
-
-
Committed by HongyuJia
* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc * Add gflags.h for other files * Add gflags.h for other files * Add gflags.h for blas_impl.hip.h * Add gflags.h for miopen_helper.h
-
Committed by Vvsmile
* adjust default tolerance of output and grad * fix a bug in the grad of OpTest * fix the type of setting default value in optest, both forward and backward * add default * fix test_sum_op * adjust tolerance * fix the tolerance of eager * add bf16 and fp16 to the activation tests * remove some fixes * fix activation * fix fp16 * fix gelu * fix the activation tests * add bfloat16 specialization to singrad and cosgrad * fix bugs * fix bugs * add unittest * add skip * add fp/bf to rrelu/rrelu_grad * git add rrelu * fix bugs
-
Committed by Galaxy1458
* delete [-Wno-error=terminate], test=develop * remove GPUps[-Wterminate],test=develop * remove some -Wno-, test=develop * modify ~MatmulDescriptor * mess
-
- 07 Apr, 2023 1 commit
-
-
Committed by Wang Xin
-
- 06 Apr, 2023 3 commits
-
-
Committed by yuehuayingxueluo
-
Committed by Sonder
* add kernel functions * update kernel functions * update func parameters' name * create codes for gpu device * adjust file locations * fix include error * remove dependent files to phi/ * restore fused_attention_op.cu * fix dependence errors * fix dependence errors * fix include error * fix all dependence errors [build success] * remove useless include * recover useless include * use phi::ToNCCLDataType * fix namespace * update new register code * fix error in fused_gemm_epilogue_utils * fix error in FusedAttentionKernel parm * finish fused_attention register code [build success] * add paddle::optional * add sig file * fix build error * fix a include error * update CMakeList * fix parameter sequence * add include file * update #if before include * fix grammar error * update codes for DropoutParam * remove const cast * trans some fluid api to phi api * add #if * update test code * update test codes * recover test codes * trans fused_attention to fluid * move #endif to end * move #endif * delete useless files * use fused attention utils and recover random seed * remove fluid include in phi
-
Committed by 张春乔
-
- 04 Apr, 2023 1 commit
-
-
Committed by chenxujun
* Add pool3d lgamma masked_select tests * Fix code
-
- 03 Apr, 2023 1 commit
-
-
Committed by engineer1109
-
- 31 Mar, 2023 1 commit
-
-
Committed by ronnywang
-
- 30 Mar, 2023 1 commit
-
-
Committed by zhouweiwei2014
-
- 29 Mar, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* add fuse adamw pass * fix some bugs * fix CIbug * change chunk_size * fix CI bug * rm test_fused_adam_op.py * fix CI bugs * fix fuse_adamw_op_pass.cc * change code style * fix CI bug * fix ut bug and use_adamw_op_pass.cc * fix test_fuse_adamw_pass.py * fix CI bug * remove fluid * fix ci bug * fix CI bug
-
- 25 Mar, 2023 1 commit
-
-
Committed by Ruibin Cheung
[Fix Bug] fix get_new_shape and get_new_data_from_tensor not support fallback to CPU on custom device (#52002)
-
- 24 Mar, 2023 3 commits
-
-
Committed by YuanRisheng
* decouple memory copy * fix ci bugs * fix ci compile bugs * fix rocm compile * fix ci bugs * decouple memory * deal with conflict * fix xpu compile bugs * fix xpu bugs * deal with xpu bugs * fix cmake bugs * fix windows bugs * fix ci bugs * fix ci bugs * delete redundance code * add code for pybind * fix py3 bugs * fix ci bugs
-
Committed by thunder95
* untracked files * kthvalue perf * remove unused files * fix isnan * fix isnan2 * fix bug * try to fix rocm error
-
Committed by ZhangDY-6483
* first version, notest * return final rst, notest * use infinity() instead of max * ut structure * start up of ut * generate lse * update * add depense * reconstruct cmake * move file * add memory efficient attention and fix blasimpl * update * update cmake * add namespace * update cmake * use .cu * update for pad3d * bug fix * bug fix * update * bug fix * update enforce * add test case * merge the lse pad * fix kernel_fn of backward * fix PADDLE_ENFORCE_EQ and phi_api * fix PADDLE_ENFORCE * fix PADDLE_ENFORCE * rerun coverage * fix memory efficient attention test * rerun ci * add cuda version condition * add cuda version condition * delete WIP test * replace PADDLE_ENFORCE * edit the namespace of datatype in multiple.cc * rerun * rerun --------- Co-authored-by: Nliuyuang <liuyuang@baidu.com>
-
- 23 Mar, 2023 3 commits
-
-
Committed by sneaxiy
* remove fluid deps in fused_linear_param_grad_add_kernel * fix compile error * fix ut error * follow comments
-
Committed by limingshu
* first commit * fix bugs * remove_useless sync
-
Committed by Lin Manhui
* Add bf16 support for elementwise_pow * Update ut
-
- 22 Mar, 2023 4 commits
-
-
Committed by YangQun
* support 0-d tensor for element wise unary ops * fix python code style check * fix approval check * support 0-d tensor for onednn softmax and logsoftmax kernels * fix comments * fix some unittests
-
Committed by Bo Zhang
* test_logit_op * add cudaKernel to replace eigen impl * bf16 unit test CI
-
Committed by Zhang Zheng
This reverts commit 3b2cd23a.
-
Committed by Difer
-
- 21 Mar, 2023 3 commits
-
-
Committed by iSerendipity
* move DataType from paddle::experimental to phi * convert namespace * convert namespace * convert namespace * clarify namespace * convert more datatype * Revert "convert more datatype" This reverts commit 083b462959e6a22d4d8767707b628b95b396642e. * convert more in auto_code_generator * fix conflicts for XPU * fix namespace conflicts * fix errors * Revert "fix errors" This reverts commit f9d9958b54ee32141112274c8a5c3c381ab0f876. * fix errors * fix formatting
-
Committed by Zhang Zheng
-
Committed by Bo Zhang
* with printf * add DropOutNdForwardKernel * PR comment
-
- 20 Mar, 2023 2 commits
-
-
Committed by limingshu
* optimization for fused linear op * fix code format * optimization for linear fused forward * merge with develop * fix bugs for gemm_epilogue * package of cublaslt epilogue type with enum * final fix before code reviewing * fix missed fusedType typo * fix code according to review suggestions * fix windows ci error * change location of MatmulPlanner * add some changes for compiler error fix
-
Committed by zhouweiwei2014
-
- 17 Mar, 2023 2 commits
-
-
Committed by Zhang Zheng
* [AMP OP&Test] Support float & bfloat16 when using cub * fix compile error * fix * fix rocm compile error
-
Committed by gouzil
* [phi][jit] rm Softmax StrideScal * [phi][jit] rm kStrideScal * [phi][jit] fix Softmax clean omission * [phi][jit] fix Softmax clean omission * [phi][jit] fix StrideScal clean omission * [phi][jit] fix mkl SoftmaxKernel clean omission * [phi][jit] fix test error * [phi][jit] fix test error * [phi][jit] rm NCHW16CMulNC * [phi][jit] fix test error * [phi][jit] rm HSum HMax * [phi][jit] fix test error * [phi][jit] rm StrideASum * add AUTHORS.md * [phi][jit] fix test error
-
- 15 Mar, 2023 3 commits
-
-
Committed by limingshu
-
Committed by thunder95
* untracked files * prelu_perf * remove unused files * upd * fix bug
-
Committed by iSerendipity
* Revert "Revert "[Hackathon No.67] remove operator.h in blas.h (#50989)" (#51467)" This reverts commit b9d91531. * remove cout * add header * fix missing header * fix refer fluid error * fix missing header * Update repeat_interleave_grad_kernel_impl.h Change to phi style datatype. * Update repeat_interleave_grad_kernel_impl.h Fix missing header * datatype fluid -> phi * paddle::experimental -> phi * fix reference error * fix reference error * fix reference error * fix errors * fix missing FLAGS * fix missing headers * fix missing headers * fix missing headers * fix missing headers * fix missing header * fix missing header * fix errors
-
- 14 Mar, 2023 2 commits
-
-
Committed by limingshu
* first commit * fix code bugs in for_loop * fix bugs in cuLoadAddStridedInputs. * optimization for LayerNormBackwardComputeGradInput * add unit test for validating the optimization * fix windows ci error
-
Committed by Huang Jiyi
* remove device_context include * fix bug * fix bug
-
- 13 Mar, 2023 1 commit
-
-
Committed by Huang Jiyi
* platform::CUDAPinnedDeviceContext -> phi::GPUPinnedContext * replace platform::TraceEventCollector
-
- 10 Mar, 2023 2 commits
-
-
Committed by YuanRisheng
This reverts commit 3f4917f6.
-
Committed by iSerendipity
* remove operator.h from blas.h and remove paddle::framework::ExecutionContext * remove the deps for GetBlas(exe_ctx) * fix error
-
- 09 Mar, 2023 2 commits
-
-
Committed by Yiqun Liu
* Add the collection and printing of kernel registry information in op benchmark ci. * Little change to test the ci. * Remove the redundant function. * Move the collection of kernel registry information to the end of ci.
-
Committed by will-jl944
* add softplus double grad * use constant method
-