- 06 Sep, 2023 3 commits
-
-
Committed by 小飞猪
[xdoctest][task 248-249,266-267,269] reformat example code with google style in `incubate/distributed/fleet/*`,`incubate/nn/layer/*` (#56772) * [Doctest]fix No.248-249,266-267,269, test=docs_preview * fix style * fix * add env:DISTRIBUTED
-
Committed by 小飞猪
[xdoctest][task 268] reformat example code with google style in `/incubate/nn/layer/fused_transformer.py` (#56965) * [Doctest]fix No.268, test=docs_preview * Apply suggestions from code review --------- Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
-
Committed by cyberslack_lee
* test=docs_preview * test=docs_preview * test=docs_preview * test=docs_preview * test=docs_preview * test=docs_preview * fix * test=docs_preview * test=docs_preview * fix * move stmts under imports --------- Co-authored-by: SigureMo <sigure.qaq@gmail.com>
-
- 05 Sep, 2023 1 commit
-
-
Committed by KongAKun
* Fix styles of code * update the GPU option * add the GPU setup * remove the note * update the code
-
- 04 Sep, 2023 1 commit
-
-
Committed by tianhaodongbd
* add rotate_half in fused_rope * add position_ids in fused_rope * modified examples about fused_rope * add set_device in examples
-
- 31 Aug, 2023 1 commit
-
-
Committed by yuchen202
-
- 25 Aug, 2023 1 commit
-
-
Committed by xiaoxiaohehe001
* add_bias_and_simplify_mmha
-
- 21 Aug, 2023 1 commit
-
-
Committed by RichardWooSJTU
-
- 16 Aug, 2023 1 commit
-
-
Committed by MarDino
* refine static op return val
-
- 15 Aug, 2023 1 commit
-
-
Committed by xiaoxiaohehe001
* support_mmha * add_python_api * add_api_doc * fix_doc_error * fix_infermeta * add_infermeta * add_bf16_cuda_check * add_bf16_check * fix_ci_windows * fix_ci_windows_kernel_register * fix_test_mmha * add_cumoffsets * remove_bias * delete_mmha_reshape_input_output * rename_delete_hfile * remove_fluid --------- Co-authored-by: yangjianfengo1 <yangjianfeng01@baidu.com>
-
- 14 Aug, 2023 1 commit
-
-
Committed by MarDino
* add rmsnorm residual bias add and quant * refine python interface * add rmsnorm unittest * Add layernorm * fix layernorm unittest * refine unittest * fix example code * fix review comment
-
- 10 Aug, 2023 1 commit
-
-
Committed by lzy
* add variable_length_memory_efficient_attention * update variable_length_memory_efficient_attention unittest * update variable_length_mem_eff_attn's docs and unittest * update variable_length_mem_eff_attn's docs * Update test_variable_length_memory_efficient_attention.py * Update variable_length_memory_efficient_attention.cu * fix codestyle * fix variable_length_fmha's docs and unittest * fix variable_length_fmha's docs
-
- 09 Aug, 2023 1 commit
-
-
Committed by niuliling123
-
- 26 Jul, 2023 2 commits
-
-
Committed by tianhaodongbd
-
Committed by JYChen
* remove api staticrnn * move select_input/output to static/controw flow * delete some func, only remain Switch * clean fluid.layers.controw_flow * remove fluid.layers.controlflow * fix conditional_block ut
-
- 20 Jul, 2023 1 commit
-
-
Committed by niuliling123
-
- 11 Jul, 2023 1 commit
-
-
Committed by MarDino
* add rmsnorm kernel * add static graph test * fix round type * use alignas to avoid msvc compile error * remove redundant headerfile to avoid rocm compile error * fix rocm compile not found cub * Add document
-
- 03 Jul, 2023 1 commit
-
-
Committed by niuliling123
-
- 29 Jun, 2023 1 commit
-
-
Committed by niuliling123
* style * more * update ctest * Update legacy_backward.yaml * Update legacy_ops.yaml * Update legacy_ops.yaml * update * update * update for move
-
- 12 Jun, 2023 1 commit
-
-
Committed by Nyakku Shigure
-
- 09 Jun, 2023 1 commit
-
-
Committed by Nyakku Shigure
* bump ruff to 0.0.271 and update config * exclude third_party * bump ruff to 0.0.272 * refine config
-
- 23 May, 2023 1 commit
-
-
Committed by cyberslack_lee
-
- 22 May, 2023 1 commit
-
-
Committed by Meteor Liu
* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * fixed cyclic reference that caused patial import * fixed bad change * fix bad import * fix bad import * fix bad import * fix ut failed caused by change in_dynamic_mode * fix ut failed caused by change in_dynamic_mode * fixed usage of in_dynamic_mode() or in_dygraph_mode() * revert python3 to python in .pre-commit-config.yaml * fix merge conflicts
-
- 19 May, 2023 1 commit
-
-
Committed by limingshu
* Reorganize the forward codes of flash-attention. * Fix forward. * Remove some noused codes. * Simplify codes and fix backward. * Change all LOG(INFO) to VLOG and fix the backward. * add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes * decrease the effect of debug print on performance * Unify the initialize of flashattn arguments. * Rewirte the reshape of temp_mask and temp_bias. * API support use_flash_attn. * Fix compiling error on CI. * Try to crop the flash-attention lib. * Correct the condition of whether can use flash-attn. * Remove the softmax_out argument. * Remove is_causal. * Polish codes. * Fix qkv_transpose_out's shape and scaling of Q * K. * Update commit of flash-attention. --------- Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 06 May, 2023 1 commit
-
-
Committed by Yiqun Liu
* Add fused_gate_attention API. * Implement FusedDropout API. * Fix doc and add unittest. * Skip for non-gpu device. * Add unittest.
-
- 17 Apr, 2023 1 commit
-
-
Committed by Chitsing KUI
* add random control for fused dropout add * add __init__
-
- 31 Mar, 2023 1 commit
-
-
Committed by 张春乔
* autofix Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com> * revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py * empty commit, trigger ci * fix test_slice --------- Co-authored-by: SigureMo <sigure.qaq@gmail.com>
-
- 29 Mar, 2023 1 commit
-
-
Committed by sneaxiy
* fix generate_kernels.py in CUDA 12.0 * fix attrs bug
-
- 24 Mar, 2023 1 commit
-
-
Committed by ZhangDY-6483
* first version, notest * return final rst, notest * use infinity() instead of max * ut structure * start up of ut * generate lse * update * add depense * reconstruct cmake * move file * add memory efficient attention and fix blasimpl * update * update cmake * add namespace * update cmake * use .cu * update for pad3d * bug fix * bug fix * update * bug fix * update enforce * add test case * merge the lse pad * fix kernel_fn of backward * fix PADDLE_ENFORCE_EQ and phi_api * fix PADDLE_ENFORCE * fix PADDLE_ENFORCE * rerun coverage * fix memory efficient attention test * rerun ci * add cuda version condition * add cuda version condition * delete WIP test * replace PADDLE_ENFORCE * edit the namespace of datatype in multiple.cc * rerun * rerun --------- Co-authored-by: liuyuang <liuyuang@baidu.com>
-
- 23 Mar, 2023 1 commit
-
-
Committed by PuQing
[CodeStyle][C408][C409][C410] Fix unnecessary <dict/list/tuple> call and unnecessary <list/tuple> passed to <list/tupule>() (#51928) * autofix * add select config * autofix C410 * add C410 select
-
- 22 Mar, 2023 1 commit
-
-
Committed by ShenLiang
-
- 17 Mar, 2023 1 commit
-
-
Committed by qizhaoaoe
* fluid clean: remove fluid.ir to framework.ir and some funcs form fluid.layer.io to incubate. * delete fluid.ir
-
- 10 Mar, 2023 1 commit
-
-
Committed by sneaxiy
* add attn_bias.py * add Python interface * add license * add test_attn_bias.py * fix CPU test error * fix ci error
-
- 22 Feb, 2023 1 commit
-
-
Committed by Shuangchi He
* Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * pre-commit Signed-off-by: Yulv-git <yulvchi@qq.com> --------- Signed-off-by: Yulv-git <yulvchi@qq.com>
-
- 15 Feb, 2023 1 commit
-
-
Committed by lzy
* make FusedMultiTransformer supports variable-lengths. * modify ffn2 when cuda_version >= 11.6 because of #49392. * code style * delete remove_padding
-
- 01 Feb, 2023 1 commit
-
-
Committed by RedContritio
* add shape check for fused_multi_head_attention * use raise for coverage test * add unittest * remove unnecessary pass * add unittest
-
- 05 Jan, 2023 2 commits
- 23 Dec, 2022 1 commit
-
-
Committed by lzy
-
- 22 Dec, 2022 1 commit
-
-
Committed by xiaoxiaohehe001
-