- 02 Mar, 2023 4 commits
-
-
Committed by HongyuJia
* polish codes according #50813 * [getCurrentCUDAStream] Add C++ API getCurrentCUDAStream * change get->Get * wrap with macro * use Get instead of get
-
Committed by Leo Chen
* register fp16 and bf16 kernel for uniform_random * fix compile * support selected_rows * add ut * revert cpu * fp16 test skip cpu
-
Committed by wangzhen38
* [cinn] concat_grad * [cinn] concat_grad * [cinn] concat_grad build success * [Add PGLBOX] fix unnitest * [Add PGLBOX] fix unnitest * [Add PGLBOX] fix codestyle * [cinn] update by comments * [cinn] update by comment * [cinn] add axis check
-
Committed by haosicheng
-
- 01 Mar, 2023 12 commits
-
-
Committed by Chitsing KUI
* flash attn * seed * almost * softmax * fix workspace * add unitest; linux only * fix setup * fix datatype include * fix setup typo * fix def scope * new error api * use paddle fork * fix attr bug; complete ut * update flash hash * fix rng reset * fix offset * fix comments
-
Committed by HongyuJia
* Add comments for #50886 * [Tensor Operants & Prim-Relevant] Tensor supports logical operants * add prim dynamic unit test * add prim static unit test
-
Committed by zqw_1997
* tmp gather vjp * support gather * remove useless code * fix compiling error * fix ut * add eager test * add eager test * add seed * small change * fix cpu error * fix transpose op compat * remove tensor index case * fix prim_cinn * small commit * add cumsum prim backward * small commit * skip aixs=None test case * fix op generante eror * fix static test error * remove unused code * fix static test error * small commit * skip cpu float16 test case * skip eager cpu cumsum float16 test case * add eager and static UT * fix ut * add composite backward rule * fix error * fix type error and format error * add try cpu+float16 test * fix test bugs * remove test for cpu+float16 and make y[0] be the grad arg * add cinn test * fix UT * fix the wrong dim of v in test cases * change y[0] to y[1] for grad in UT * reshape flatten out * Disable cinn single test * use scatter_nd_add * modify the reshape part of topk_grad * delete useless build file * to make the syntax right * modify bug * try use of put_along_axis * remove cinn test * reformat todo * add silu composite rule * fix code style. * add cinn test * fix composite grad maker code gen * add prim in cumsum op test * remove old test * fix typro * pass the static test * fix typro * modify optest and delete old test files * remove normal test_top_k_op test * fix typro * pass axis=None test case * buffer comment * for debug * add silu fp16 unit test. * add static guard * remove forward prim test * remove same name axis * modify the test_top_v2_op.py to pass all local tests * delete the useless testcase * fix mistake * add more testcases to test dtype16 and dtype32 --------- Co-authored-by: JiabinYang <360788950@qq.com> Co-authored-by: GGBond8488 <857631483@qq.com> Co-authored-by: zxcd <228587199@qq.com> Co-authored-by: Charles-hit <wanghao107@baidu.com>
-
Committed by yunyaoXYY
* Add unitest from shilong * Add kernel code from shilong * fix codestyle * add broadcast_shape test * fix unitest * fix unitests * fix unitest * add 0D grad support * add 0D grad support * add 0D grad support * fix 0D tensor * fix 0D * fix xpu 0D * fix expand kernel * fix xpu expand * Fix 0D kernel * fix 0D * fix 0D * fix 0D * fix 0D * fix XPU top_k * cancel the modify of xpu * add XPU 0D tensor * fix 0D
-
Committed by wawltor
-
Committed by mayang002
-
Committed by chenxiao120660
* fix bug of logsumexp * fix bug for logsumexp * fix bug for logsumexp
-
Committed by cyber-pioneer
-
Committed by niuliling123
-
Committed by duanyanhui
* add support of int64 add for xpu * add transpose support for int64 * add randperm kernel * fix randperm * add distribute_fpn_proposal kernel * fix comment * add reduce_sum_int32
-
Committed by engineer1109
-
Committed by risemeup1
-
- 28 Feb, 2023 9 commits
-
-
Committed by gouzil
* [phi] move device_wrapper from fluid to phi * [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
-
Committed by HongyuJia
-
Committed by xiaoguoguo626807
* modify name * merge develop * original code * build modify * success 2*2 * fused dim=1 failed * success * modify static * success for static except dim=1 * delete log * tmp modify * success * success * add fp1664 * delete fp16 cpu test * stop windows test * review modify * modify tanh test * modify tanh * fix_conflixt * modift static prim * fix_conflict * Update test_static_prim.cc * update * bug fix
-
Committed by HongyuJia
* [C++ API GetAllocator] Add C++ GetAllocator interface * move api to accurate directory
-
Committed by GGBond8488
* add cumsum prim backward * skip aixs=None test case * fix op generante eror * fix static test error * remove unused code * fix static test error * skip cpu float16 test case * skip eager cpu cumsum float16 test case * add cinn test * reshape flatten out * Disable cinn single test * remove cinn test * reformat todo * add prim in cumsum op test * remove old test * fix typro * fix typro * fix typro * pass axis=None test case * remove forward prim test * remove same name axis
-
Committed by zhupengyang
-
Committed by shentanyue
-
Committed by taixiurong
-
Committed by Jiabin Yang
* support transpose and reshape * support reshpe, transpose, cast vjp * merge develop * recover unused file * remove prim base * support problem * remove additional status settting * remove additional status settting * fix ut * fix ut * fix ut * fix no grad branch * add more test * disable fp16 in cpu * fix test
-
- 27 Feb, 2023 8 commits
-
-
Committed by houj04
* [XPU] add fp16 support for shape op. * [XPU] add fp16 support for lookup_table_v2 op. * update approval list: add qingshu's id.
-
Committed by 张春乔
* remove utils * remove utils * remove utils * remove utils * Update get_data_from_tensor.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_functor.h * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * remove utils * Update rnn_functor.h * remove utils * remove utils * remove utils * remove utils * remove utils * Update rnn_functor.h * Update unsqueeze_op.h * Update utils.h * roll back * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * Update rnn_kernel.cc * Update rnn_grad_kernel.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * add TensorToVector * roll back * Update tensor_utils.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update tensor_utils.h * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * TensorCopySync to phi::Copy * fix codestyle * rnn_kernel.cc: add ; * replace all GetDataFromTensor with phi::GetVectorFromTensor * delete include of util.h
-
Committed by HongyuJia
* [Tensor Operants & Prim] Tensor pow API uses elementwise_pow * unittest change to fill_constant+elementwise_pow
-
Committed by Bo Zhang
* conflict * add UpdateSliceAttrs
-
Committed by Yiqun Liu
-
Committed by zhouweiwei2014
-
Committed by wangshengxiang
* [XPU] bind op scatter_nd_add * [XPU] add more data type for op: clip, transpose2 & assign_value
-
Committed by shaojie_wang
* register bfloat16 datatype for squared l2 norm * register bfloat16 datatype for softmax with upper triangular mask * register bfloat16 for tril triu cuda kernel
-
- 26 Feb, 2023 2 commits
-
-
Committed by limingshu
* implement of matmul using cublasLt instead of cublas * Update matmul_kernel_impl_via_blasLt.h --------- Co-authored-by: zhangbopd <1299246947@qq.com> Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com> Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Yiqun Liu
* Enable matmul + bias fusion in fused_gat_attention. * Add a variable to control whether using fused matmul + bias.
-
- 25 Feb, 2023 1 commit
-
-
Committed by zyfncg
* rename elementwise_heaviside to heaviside * delete __init__.py * fix bug
-
- 24 Feb, 2023 4 commits
-
-
Committed by yunyaoXYY
-
Committed by niuliling123
-
Committed by Yuanle Liu
-
Committed by HappyHeavyRain
* support 'backend' in static ops * change bitwise_xx comment in python * change bitwise_xxx comment in python * change 'backend' and 'data_type' in GetExpectedKernelType
-