- 03 Mar, 2023 4 commits
-
-
Committed by gouzil
* [phi] move jit kernels from fluid to phi * [phi] fix paddle::phi err * [phi] fix windows 'posix_memalign': identifier not found * [phi] fix windows 'posix_memalign_free': identifier not found * [phi] fix readme directory structure, fc_functor paddle::platform
-
Committed by YuanRisheng
* decouple memory copy * fix ci bugs * fix ci compile bugs * fix rocm compile * fix ci bugs
-
Committed by zhangkaihuo
-
Committed by niuliling123
-
- 02 Mar, 2023 6 commits
-
-
Committed by Ruibiao Chen
* Check structed kernel for new executor static build * Update code * Ready for resnet50 * Move transfer_dtype to phi * Ready for transformer * Fix CI errors * Fix layer_norm InferMeta * Remove layer_norm infermeta fix
-
Committed by limingshu
* first commit * finish base work * modification for good * fix for cache setting and gather the algo and desc as one data for cache storage * fix for cache setting and gather the algo and desc as one data for cache storage * install pre-commit check
-
Committed by chenxiao120660
-
Committed by ahahahahahaha
-
Committed by wangshengxiang
-
Committed by Leo Chen
* register fp16 and bf16 kernel for uniform_random * fix compile * support selected_rows * add ut * revert cpu * fp16 test skip cpu
-
- 01 Mar, 2023 7 commits
-
-
Committed by Chitsing KUI
* flash attn * seed * almost * softmax * fix workspace * add unitest; linux only * fix setup * fix datatype include * fix setup typo * fix def scope * new error api * use paddle fork * fix attr bug; complete ut * update flash hash * fix rng reset * fix offset * fix comments
-
Committed by yunyaoXYY
* Add unitest from shilong * Add kernel code from shilong * fix codestyle * add broadcast_shape test * fix unitest * fix unitests * fix unitest * add 0D grad support * add 0D grad support * add 0D grad support * fix 0D tensor * fix 0D * fix xpu 0D * fix expand kernel * fix xpu expand * Fix 0D kernel * fix 0D * fix 0D * fix 0D * fix 0D * fix XPU top_k * cancel the modify of xpu * add XPU 0D tensor * fix 0D
-
Committed by wawltor
-
Committed by mayang002
-
Committed by chenxiao120660
* fix bug of logsumexp * fix bug for logsumexp * fix bug for logsumexp
-
Committed by niuliling123
-
Committed by duanyanhui
* add support of int64 add for xpu * add transpose support for int64 * add randperm kernel * fix randperm * add distribute_fpn_proposal kernel * fix comment * add reduce_sum_int32
-
- 28 Feb, 2023 4 commits
-
-
Committed by gouzil
* [phi] move device_wrapper from fluid to phi * [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
-
Committed by zhupengyang
-
Committed by shentanyue
-
Committed by taixiurong
-
- 27 Feb, 2023 7 commits
-
-
Committed by houj04
* [XPU] add fp16 support for shape op. * [XPU] add fp16 support for lookup_table_v2 op. * update approval list: add qingshu's id.
-
Committed by 张春乔
* remove utils * remove utils * remove utils * remove utils * Update get_data_from_tensor.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_functor.h * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * remove utils * Update rnn_functor.h * remove utils * remove utils * remove utils * remove utils * remove utils * Update rnn_functor.h * Update unsqueeze_op.h * Update utils.h * roll back * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * Update rnn_kernel.cc * Update rnn_grad_kernel.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * add TensorToVector * roll back * Update tensor_utils.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update tensor_utils.h * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * TensorCopySync to phi::Copy * fix codestyle * rnn_kernel.cc: add ; * replace all GetDataFromTensor with phi::GetVectorFromTensor * delete include of util.h
-
Committed by Bo Zhang
* conflict * add UpdateSliceAttrs
-
Committed by Yiqun Liu
-
Committed by zhouweiwei2014
-
Committed by wangshengxiang
* [XPU] bind op scatter_nd_add * [XPU] add more data type for op: clip, transpose2 & assign_value
-
Committed by shaojie_wang
* register bfloat16 datatype for squared l2 norm * register bfloat16 datatype for softmax with upper triangular mask * register bfloat16 for tril triu cuda kernel
-
- 26 Feb, 2023 1 commit
-
-
Committed by limingshu
* implement of matmul using cublasLt instead of cublas * Update matmul_kernel_impl_via_blasLt.h --------- Co-authored-by: zhangbopd <1299246947@qq.com> Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com> Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 25 Feb, 2023 1 commit
-
-
Committed by zyfncg
* rename elementwise_heaviside to heaviside * delete __init__.py * fix bug
-
- 24 Feb, 2023 5 commits
-
-
Committed by yunyaoXYY
-
Committed by niuliling123
-
Committed by YuanRisheng
-
Committed by xiaoguoguo626807
* support prim test in OpTest * fix cmake * fix op test * fix test_input_spec * disable cinn in reduce_sum unit test * add bfloat16 dtype for sum * add approve rules * polish code * add clear jit program function * convert grad out from tensor to numpy * remove unnecessary code * add only_prim flag * fix flag * fix op test * add attr * fix optest comp inplace error * fix op test * fix op test with guard * add initialization of check_comp flag * fix comp inplace error in op test * rename check_comp with check_prim and add bfloat16 dtype convert * rename comp_op_type to prim_op_type * rename comp to prim * remove useless code * skip ci check for only prim * add no_grad_vars and grad_outputs in prim test * fix var_dict * fix op test for only_prim * fix dy2static bugs * polish some code * temp * modify op test * except cinn test * modify bfp16 * modify pad grad * add pad_grad dtype * start cinn part --------- Co-authored-by: Charles-hit <wanghao107@baidu.com>
-
Committed by ronnywang
* [XPU] add expand_grad, isnan, meshgrid kernels * update
-
- 23 Feb, 2023 4 commits
-
-
Committed by limingshu
-
Committed by csy0225
-
Committed by Huang Jiyi
* move fluid generator to phi * move fluid generator to phi * update .gitignore * fix bugs * fix cannot find "glog/logging.h" in "generator.h" * fix bugs
-
Committed by limingshu
* first commit * main codes has been developed * fix all bugs * add vectorize input&output * a test for optimization_of_layer_norm_fwd * add some changes * fix memory coalesced access for more optimization. * fix addition ctest error * fix according to ci-approval * remove change on slice
-
- 22 Feb, 2023 1 commit
-
-
Committed by Shuangchi He
* Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * pre-commit Signed-off-by: Yulv-git <yulvchi@qq.com> --------- Signed-off-by: Yulv-git <yulvchi@qq.com>
-