- 27 Feb, 2023 32 commits
-
-
Committed by 张春乔
* remove utils * remove utils * remove utils * remove utils * Update get_data_from_tensor.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_functor.h * Update rnn_kernel.cu.cc * Update rnn_kernel.cc * remove utils * Update rnn_functor.h * remove utils * remove utils * remove utils * remove utils * remove utils * Update rnn_functor.h * Update unsqueeze_op.h * Update utils.h * roll back * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * Update tensor_utils.h * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * use TensorToVector * Update rnn_kernel.cc * Update rnn_grad_kernel.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * add TensorToVector * roll back * Update tensor_utils.h * Update rnn_functor.h * Update rnn_grad_kernel.cu.cc * Update tensor_utils.h * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * Update rnn_grad_kernel.cu.cc * Update rnn_kernel.cu.cc * Update rnn_grad_kernel.cc * Update rnn_kernel.cc * TensorCopySync to phi::Copy * fix codestyle * rnn_kernel.cc: add ; * replace all GetDataFromTensor with phi::GetVectorFromTensor * delete include of util.h
-
Committed by chenxujun
-
Committed by Maple Xie
* Fix fp16 dtype checking for AvgPool1D op * Update code style for PR-CI-Static-Check
-
Committed by Wang Bojun
* add sm version check * use GetGPUComputeCapability
-
Committed by 张春乔
-
Committed by HongyuJia
* [Tensor Operants & Prim] Tensor pow API uses elementwise_pow * unittest change to fill_constant+elementwise_pow
-
Committed by 张春乔
* support fp16 on AvgPool3D * Apply suggestions from code review
-
Committed by 张春乔
-
Committed by 张春乔
-
Committed by 张春乔
-
Committed by 张春乔
-
Committed by 张春乔
-
Committed by haozi
* fix fp16 dtype checking for clip op * modify the name * fix type error * fix check error * Update test_clip_op.py fix test error * Update test_clip_op.py fix code style --------- Co-authored-by: Zhang Ting <Douyaer2020@qq.com>
-
Committed by Infinity_lee
-
Committed by HongyuJia
* [Error Msg] Polish error message when GPU kernel not found * Only test in GPU environment
-
Committed by Zhang Ting
* fix fp16 dtype checking for argmax op * run fp16 test when place is gpu * Update search.py fix doc
-
Committed by Ainavo
-
Committed by 陈沧夜
-
Committed by 张春乔
* add float16 in python/paddle/math * add unittest for float16 * add float16 support in python.paddle.tensor.search.where * remove fp16 error cases * Add NotImplementedError unittest * fix codestyle * fluid to paddle.static; add cases with GPU * Add float16 in English docs
-
Committed by Bo Zhang
* conflict * add UpdateSliceAttrs
-
Committed by gaoziyuan
-
Committed by csy0225
-
Committed by Charles-hit
-
Committed by jameszhang
* [kunlun] support reduce_scatter * uncomment unittest * update xccl to 1.0.10
-
Committed by Yiqun Liu
-
Committed by zhouweiwei2014
-
Committed by zhangbo9674
* add TypeUniquer and IrContext * refine include code * add Type, TypeBase * add built-in type * add bulit-in Float32Type * refine ut * refine code * refine code * delete type_base * rename ImplType to StorageType * rename ImplType to StorageType * add macros util for register type * add macros util for register type * refine name * refine name * change storage manager * add multi_thread for ir_ctx * rwlock_2_spinlock, add REGISTER_TYPE_2_IRCONTEXT * DECLARE_TYPE_UTILITY_FUNCTOR * refine ircontext singleton * del destructor for ParametricStorageManager * refine code * Add necessary logs for debugging * refine ir_context instance * refine type get interface * refine code by comment
-
Committed by wangshengxiang
* [XPU] bind op scatter_nd_add * [XPU] add more data type for op: clip, transpose2 & assign_value
-
Committed by zhaoyingli
* fix dist_attr in data_parallel in optimization * fix grad_clip pass when pp2 * fix dist_attr
-
Committed by shaojie_wang
* register bfloat16 datatype for squared l2 norm * register bfloat16 datatype for softmax with upper triangular mask * register bfloat16 for tril triu cuda kernel
-
Committed by wangzhen38
* [mv fleet] mv fleet to distributed * [mv fleet] for ci * [mv fleet] for ci * [mv fleet] solve ci of version
-
Committed by zhaoyingli
* fix set_grad_var_shape * recover modify
-
- 26 Feb, 2023 2 commits
-
-
Committed by limingshu
* implement of matmul using cublasLt instead of cublas * Update matmul_kernel_impl_via_blasLt.h --------- Co-authored-by: zhangbopd <1299246947@qq.com> Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com> Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Yiqun Liu
* Enable matmul + bias fusion in fused_gat_attention. * Add a variable to control whether using fused matmul + bias.
-
- 25 Feb, 2023 3 commits
-
-
Committed by zhouweiwei2014
-
Committed by Vvsmile
* change outputs and grads from fp16-fp16-comparision and fp16-fp32 comparision * support grad comparision fp16-fp32 * the change of reference dtype only occured from np.float16 to np.float32 * fix the list type can not infer the dtype by attribute dtype by transfer the list to array * adjust the default atol and rtol of float16 to 1e-3 * Polish code * fix error * fix * Polish code * fix the _is_cal_ref and np.float16 * fix the combination of is_calc_ref and np.float16 * remove unuseful codes in op_test.py * fix ci * fix the rtol set in the dygraph checker and eager checker --------- Co-authored-by: ZzSean <18818272991@163.com>
-
Committed by zyfncg
* rename elementwise_heaviside to heaviside * delete __init__.py * fix bug
-
- 24 Feb, 2023 3 commits
-
-
Committed by yunyaoXYY
-
Committed by chenxujun
-
Committed by Weilong Wu
* Revert "fixoptminizer _set_auxiliary_var bug (#50335)" This reverts commit c44005f0. * Revert "refine optimizer create accumulators (#50188)" This reverts commit 244e7546. * Revert "fix found_inf bug for custom optimizer (#50158)" This reverts commit 64573f9f. * Revert "refine amp scaler found_inf (#49864)" This reverts commit 382e9a06. * fix code format * fix conflict
-