- 19 May, 2023 20 commits
-
-
Committed by limingshu
* Reorganize the forward code of flash-attention.
* Fix forward.
* Remove some unused code.
* Simplify code and fix backward.
* Change all LOG(INFO) to VLOG and fix the backward.
* Add scale for AF2 flash_attn; many thanks to xreki and shaojie for debugging this code.
* Decrease the effect of debug printing on performance.
* Unify the initialization of flashattn arguments.
* Rewrite the reshape of temp_mask and temp_bias.
* API support use_flash_attn.
* Fix compiling error on CI.
* Try to crop the flash-attention lib.
* Correct the condition for whether flash-attn can be used.
* Remove the softmax_out argument.
* Remove is_causal.
* Polish code.
* Fix qkv_transpose_out's shape and scaling of Q * K.
* Update commit of flash-attention.
---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by limingshu
-
Committed by RedContritio
-
Committed by gouzil
* [tools] add PADDLE_API check file diff approvals
* [tools] fix determine
* [tools] fix determine
* [tools] Change to full character matching
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
* [tools] Update echo_line
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
* [tools] Update check_approval
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
---------
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
-
Committed by GGBond8488
* remove user define grad * fix errors * remove unused self.x_grad, self.out_grad
-
Committed by Zhang Zheng
* Add large dim test of log_softmax * fix
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by xiaoguoguo626807
* review * modify opcompat bug * modify pybind
-
Committed by Charles-hit
-
Committed by Danyang Zhang
* delete bf16 of cross entropy * delete bf16 of cross entropy
-
Committed by zhoutianzi666
* decrease_peak_memory
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by Galaxy1458
-
Committed by ronnywang
-
Committed by zhangyuqin1998
-
- 18 May, 2023 20 commits
-
-
Committed by houj04
-
Committed by Galaxy1458
-
Committed by Charles-hit
* add meshgrid,expand_as, prod and grad bf16 kernel * fix bf16 for optest * modify code style * fix amp test
-
Committed by PuQing
* fix parameter not passed * fix repr
-
Committed by risemeup1
* ignore third_party * modify .gitmodules * test=document_fix
-
Committed by tianshuo78520a
* test=document_fix * test=document_fix
-
Committed by HongyuJia
* [CINN] Fix TestGelu unittest of CINN * pass if_enable_cinn
-
Committed by Yuanle Liu
-
Committed by co63oc
-
Committed by engineer1109
-
Committed by Hulek
* Fused elementwises kernels and ops
* change fuse pass name
* adjust .pbtxt files
* adjust quantization attributes
* add missing arguments and fix others, review fixed
* simplify fused kernel registration
* fix elementwise unit tests
* reuse one fused elementwise op
* adjust proto
* Add supported datatypes
* Change 'Scale' to 'scale' in tests, change some tests to onednn
* Revert breaking changes
* Fix unit tests
* Delete obsolete test cases
* Delete commented out code
* Fix codestyle
* delete temporary condition
* fix conflicts and delete duplicate fusing
* Fix code after merge
* Move tests to new directory
* fix tests volatility
* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py
* Update CMakeLists.txt add mkldnn op test
---------
Co-authored-by: Silv3S <slawomir.siwek@intel.com>
-
Committed by huangjiyi
-
Committed by co63oc
-
Committed by co63oc
-
Committed by co63oc
-
Committed by LoneRanger
-
Committed by Wang Xin
* move sequence_mask op InferShape func * add dtype infer
-
Committed by co63oc
-
Committed by tianshuo78520a
* fix * fix
-