- 25 May, 2023 1 commit
-
-
Committed by ronnywang
-
- 24 May, 2023 1 commit
-
-
Committed by Yiqun Liu
* Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas. * Change the repeat of cublaslt to 10. * Use FLAGS_cublaslt_exhaustive_search_times as repeats. * Fix compiling error on CI. * Polish the key and simplify codes.
-
- 23 May, 2023 11 commits
-
-
Committed by Fisher
* Enable check_cinn on some tests. Tests: bitwise, compare, shape, assign_value, sum, expand_v2, lookup_table, lookup_table_v2. * Enable more CINN tests. Tests with CINN: expand_v2, matmul, matmul_v2, mul, norm, one_hot_v2. Add target select in cinn_launch_op. * Revert test_mul_op * Improve op unit tests
-
Committed by LiYuRio
-
Committed by gouzil
* [phi] autogen code tril_triu * [phi][api]fix tril_triu_grad args * [fluid] clean cmake; [phi] fix infer_meta
-
Committed by co63oc
-
Committed by cyberslack_lee
-
Committed by huangjiyi
* update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update HostAlloc * update param name * update cpu kernel * remove kernel header * update * update
-
Committed by huangjiyi
* update * update * update * set out dtype
-
Committed by Wang Xin
* static graph autogen code support for pad3d op * bug fixed * add ut for pad3d mkldnn op * fix coverage * fix bug * fix bug * Delete test_pad3d_mkldnn_op.py
-
Committed by ronnywang
* [CustomDevice] fix auto_parallel * update * update * update
-
Committed by LoneRanger
* fix the static op generation for group_norm * fix bug of mismatch * fix bug of AssertionError * fix setting of composite
-
Committed by HongyuJia
* [0D-Tensor] Support elementwise_add * support elementwise_add ZeroDim2&3
-
- 22 May, 2023 3 commits
-
-
Committed by risemeup1
* update_c++14_to_c++17_on_windows * disable test_audio_logmel_feature and test_audio_mel_feature
-
Committed by Yuanle Liu
[Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt (#52485)
-
Committed by Tian Zheng
* Add GPU kernel for multiclass_nms3 op * Make multiclass_nms3 gpu kernel output consistent with cpu kernel * Fix API incompatibility * Fix unittests on builds without CUDA * Fix ROCM build * Remove fluid headers; Use default atol for unittest * Change function and variable naming * Add comments; Reduce redundant code * Use paddle test framework
-
- 19 May, 2023 5 commits
-
-
Committed by warrentdrew
* add minimum grad composite rules * add public python api * fix format * fix format * update testcase * fix testcase * fix format * fix cmakelist.txt * fix format * fix param problem * fix op and composite rule * fix bf16 cpu support problem * fix bf16 cpu issue * fix axis error log * add axis for maximum * revert commit * remove .orig * fix generic problem * revert max op * fix axis error * fix maximum axis * fix test_check_output * fix cinn * fix minimum maximum axis check
-
Committed by limingshu
* Reorganize the forward codes of flash-attention. * Fix forward. * Remove some unused codes. * Simplify codes and fix backward. * Change all LOG(INFO) to VLOG and fix the backward. * add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes * decrease the effect of debug print on performance * Unify the initialize of flashattn arguments. * Rewrite the reshape of temp_mask and temp_bias. * API support use_flash_attn. * Fix compiling error on CI. * Try to crop the flash-attention lib. * Correct the condition of whether can use flash-attn. * Remove the softmax_out argument. * Remove is_causal. * Polish codes. * Fix qkv_transpose_out's shape and scaling of Q * K. * Update commit of flash-attention. --------- Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Galaxy1458
-
Committed by xiaoguoguo626807
* review * modify opcompat bug * modify pybind
-
Committed by ronnywang
-
- 18 May, 2023 6 commits
-
-
Committed by Hulek
* Fused elementwises kernels and ops * change fuse pass name * adjust .pbtxt files * adjust quantization attributes * add missing arguments and fix others, review fixed * simplify fused kernel registration * fix elementwise unit tests * reuse one fused elementwise op * adjust proto * Add supported datatypes * Change 'Scale' to 'scale' in tests, change some tests to onednn * Revert breaking changes * Fix unit tests * Delete obsolete test cases * Delete commented out code * Fix codestyle * delete temporary condition * fix conflicts and delete duplicate fusing * Fix code after merge * Move tests to new directory * fix tests volatility * Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py * Update CMakeLists.txt add mkldnn op test --------- Co-authored-by: Silv3S <slawomir.siwek@intel.com>
-
Committed by huangjiyi
-
Committed by Wang Xin
* move sequence_mask op InferShape func * add dtype infer
-
Committed by co63oc
-
Committed by RedContritio
* simplify layer_norm_op.cc * support auto generate for op layer_norm * update unittest for composite_layer_norm * remove layer_norm_op.cc from scripts * replace layer_norm_op with generated_op * add get_expected_kernel for layer_norm * update cmake kernel register function for layer_norm_mkldnn_op
-
Committed by co63oc
-
- 17 May, 2023 1 commit
-
-
Committed by gouzil
-
- 16 May, 2023 9 commits
-
-
Committed by Galaxy1458
* test,test=develop * test,test=develop * test,test=develop * test,test=develop * test,test=develop
-
Committed by xiaoguoguo626807
* add rules * modify no kernel yaml parse * success op generate * success test_silu_double * modify bug * modify static error * modify silu_grad input * modify kernel signature * modify kernel signature * code style * code style * review * delete opinfo modify * modify gradOpMaker * modify gradOpMaker * modify genarated-j2 * add approve rules * modify aytograd_functional_static_test
-
Committed by huangjiyi
* update * fix bug * test * test * update * update mutable_data * fix bug * update * fix bug * update output type reg * update * update
-
Committed by 张春乔
* rm npu * rm use_npu * rm npuid * rm use_npu * rm npuid * delete npupinned * roll back sth. * roll back sth. * delete npupinned * roll back sth. * roll back sth. * rm npu * rollback something * rollback npu identity * rollback npu identity
-
Committed by Sonder
* trans fused batch norm Compute function * trans batch norm register info to phi * trans fused batch norm grad Compute * trans batch norm grad register info * add sig file * update sig file * Update fused_bn_activation_kernel.cu * Update fused_bn_activation_grad_kernel.cu * fix * Rename fused_bn_activation_kernel_grad.cu to fused_bn_activation_kernel.cu * fix * fix * fix CudnnDataType error * fix * fix include * update * add #if * add fused bn act to cmakelist.txt * update cmakelist * fix #ifdef error * add timeout set * add env set * fix * fix * Update fused_bn_activation_sig.cc
-
Committed by Wang Xin
* static graph autogen code support for softmax op * bug fixed * fix PR-CI-Windows error * fix CI error * bug fixed * fix conflicts
-
Committed by cyberslack_lee
-
Committed by 张春乔
* mv InstanceNorm * modify op_version.yaml * modify add Operator:: in get_expected_kernel_func.cc * rm gradexpectedkernel * add extra * add float epsilon=1e-5
-
Committed by gouzil
* [phi]mv StftKernel to phi * [phi] fix KernelSignature * [phi]fix arr error * [phi] Disable check_dygraph * [phi]fix include * [phi] rewrite mutable_data, add output register * [phi] fix Alloc * [phi] fix Alloc again * [phi] fix mutable_data * [phi] fix onesided_out Resize
-
- 15 May, 2023 3 commits
-
-
Committed by huangjiyi
* update * fix bug * fix output type def
-
Committed by ronnywang
-
Committed by xiaoguoguo626807
* add rules * modify no kernel yaml parse * success op generate * success test_silu_double * modify bug * modify static error * modify silu_grad input * modify kernel signature * modify kernel signature * code style * code style * review * delete opinfo modify
-