- 05 Sep, 2023 (1 commit)
  - Committed by Haohongxiang
- 31 Aug, 2023 (1 commit)
  - Committed by ShenLiang
    * add usecache
    * add p2p cache fix
    * add cache
- 29 Aug, 2023 (1 commit)
  - Committed by ShenLiang
- 25 Aug, 2023 (2 commits)
- 19 Aug, 2023 (1 commit)
  - Committed by ShenLiang
    * add debug information
    * fix log
    * fix log
    * add detach for pp
- 10 Aug, 2023 (1 commit)
  - Committed by zhenhailiu
    * dp and sharding coexist
    * dp
- 09 Aug, 2023 (4 commits)
  - Committed by Yuang Liu
    * skip CopyOrAdd when tmp grad is None (#55679)
    * Optim fused linear grad add (#55927)
  - Committed by niuliling123
  - Committed by ShenLiang
  - Committed by Yuang Liu
- 08 Aug, 2023 (1 commit)
  - Committed by sneaxiy
    * make flash attn v1 available
    * add deps error
    * refine cmake dependencies
    * fix cmake error
- 07 Aug, 2023 (2 commits)
  - Committed by umiswing
    * [FlashAttn] add flash randomness control (#52902)
    * add flash randomness control
    * fix VLOG undefied
    * [WIP] Integration flash attention 2 (#55758)
    * Work for fa-2 padded fwd. Code to be cleaned.
    * Work for fa2 unpadded fwd.
    * Work for padded-bwd, dk get small diff on np.random.seed(0)
    * Anyway I pass paddle's utest, except return softmax without dropout.
    * Clean code.
    * Modify interface.
    * Clean code and add some check.
    * Easy compile for dev.
    * Fix ci.
    * Fix ci-build.
    * Add std c++17 option again.
    * Limit max job when compiling fa2.
    * Remove const_cast
    * Add fwd params, to be cleaned.
    * Clean code.
    * Add bwd params.
    * Clean code.
    * Add enforce.
    * Use v2.0.4
    * Pass RNG state to fa2 capi
    * Fix review.
    * Add assert
    * Skip compile for sm less than 80.

    Co-authored-by: Chitsing KUI <kuizhiqing@msn.com>
  - Committed by niuliling123
    * Add fused_rope forward op (#54351)
    * style
    * more
    * update ctest
    * Update legacy_backward.yaml
    * Update legacy_ops.yaml
    * Update legacy_ops.yaml
    * update
    * update
    * update for move
    * Update the rope op according to the comments (#54985)
    * Update multiary.cc
    * Update __init__.py
    * for int64_t and assert
    * more
    * remove useless assert first

    Co-authored-by: sneaxiy <sneaxiy@126.com>
- 02 Aug, 2023 (3 commits)
  - Committed by wuhuachaocoding
  - Committed by ShenLiang
    * fix bug
    * fix bug
    * fix bug
    * fix bug
    * fix bug
  - Committed by WangZhen
    * Fix test_resnet and test_resnet_v2 ut
    * Remove ut
- 26 Jul, 2023 (1 commit)
  - Committed by ShenLiang
    * Add virtual pp and dp overlap
    * add sharding/dp overlap
    * add dp/vpp overlap
    * fix code
    * fix log
- 22 Jul, 2023 (2 commits)
- 21 Jul, 2023 (1 commit)
  - Committed by Tian
    * add paddle.async_save to reduce time cost by checkpoint saving
    * adapt save_for_auto_inference to paddle.async_save
    * modify UT
    * modify UT
    * fix on cpu only version
    * revert commit on save_auto_inference
    * fix threading
- 18 Jul, 2023 (2 commits)
  - Committed by zhenhailiu
    * new_frl_shard_redece
    * add mp guard
    * add test
  - Committed by lzy
    * make top_p_sampling supports threshold
    * delete __nv_bfloat16
- 17 Jul, 2023 (1 commit)
  - Committed by ShenLiang
- 15 Jul, 2023 (1 commit)
  - Committed by sneaxiy
    * fix new launch
    * fix ps uit
- 13 Jul, 2023 (2 commits)
- 12 Jul, 2023 (2 commits)
- 05 Jul, 2023 (1 commit)
  - Committed by sneaxiy
    * refine dygraph_sharding_optimizer.py by sorting parameters
    * Update dygraph_sharding_optimizer.py
      Make FLAGS_sharding_sort_parameters=1 by default.
- 04 Jul, 2023 (2 commits)
- 30 Jun, 2023 (1 commit)
  - Committed by sneaxiy
- 29 Jun, 2023 (2 commits)
  - Committed by ShenLiang
  - Committed by pangengzheng
    * support add(x_float32, bfloa16_) or add(x_float32, y_float16)
    * polisg
- 28 Jun, 2023 (2 commits)
- 27 Jun, 2023 (1 commit)
  - Committed by Yuang Liu
- 21 Jun, 2023 (1 commit)
  - Committed by zhenhailiu
- 19 Jun, 2023 (1 commit)
  - Committed by ShenLiang
    * add p2p calc stream
    * rm code
    * rm code
    * rm assert
    * rm code