- 30 Apr, 2023 1 commit
-
-
Committed by zhouweiwei2014
-
- 28 Apr, 2023 2 commits
-
-
Committed by Meteor Liu
-
Committed by Roc
Synchronize at the first recv operator. If all send and recv operators are wrapped with group start and end, the received tensor will be incomplete.
-
- 27 Apr, 2023 1 commit
-
-
Committed by houj04
* [XPU] remove scale_loss in parallel.py * [XPU] throw Unimplemented when using Reducer
-
- 26 Apr, 2023 3 commits
-
-
Committed by zhouweiwei2014
-
Committed by zhenhailiu
* polish
-
Committed by ShenLiang
-
- 25 Apr, 2023 2 commits
-
-
Committed by wuhuachaocoding
-
Committed by Chitsing KUI
* print modified flags * fix ref, opt print * fix default getter * fix ut
-
- 24 Apr, 2023 5 commits
-
-
Committed by zhouweiwei2014
-
Committed by kangguangli
* fix bug: wrong match between depend and c_allreduce_sum * fix codestyle * fix bug * add c_sync_calc_stream back * fix * revert * use flag to control * fix for code coverage
-
Committed by 张春乔
-
Committed by 张春乔
-
Committed by zqw_1997
* test=allcase * fix doc errors, test=allcase
-
- 23 Apr, 2023 1 commit
-
-
Committed by Chitsing KUI
* save env log for each worker * fix ut
-
- 22 Apr, 2023 1 commit
-
-
Committed by zhouweiwei2014
-
- 21 Apr, 2023 3 commits
- 20 Apr, 2023 3 commits
-
-
Committed by Chitsing KUI
* add flash randomness control * fix VLOG undefined
-
Committed by zhouweiwei2014
-
Committed by Wang Xin
* remove ASCEND* keyword * update docstring * bug fixed * bug fixed
-
- 19 Apr, 2023 2 commits
-
-
Committed by ronnywang
* [CustomDevice] add recompute support * update
-
Committed by kangguangli
* fix * fix fuse group order
-
- 18 Apr, 2023 2 commits
-
-
Committed by 张春乔
-
Committed by Meteor Liu
* rename _varbase_creator as create_tensor
-
- 17 Apr, 2023 7 commits
-
-
Committed by Yulong Ao
-
Committed by LiYuRio
* cherry-pick fleet executor from 2.4 * fix test case
-
Committed by Chitsing KUI
* add random control for fused dropout add * add __init__
-
Committed by Kim Yann
-
Committed by 张春乔
* remove hccl in .py files * remove ascend in setup.py.in * remove ascend in setup.py
-
Committed by Haohongxiang
-
Committed by caozhou
* add o2 tune * add unittest * fix error * set unittest timeout
-
- 14 Apr, 2023 2 commits
-
-
Committed by Feiyu Chan
1. Modify the set_value op to use Scalars to represent the attr `values`, instead of a bunch of attributes of various types (#52408).
2. Add a program converter, with the set_value op as an example, which provides the functionality to convert `paddle::framework::ProgramDesc` between the old and new formats (the differences are mainly some operators with incompatible updates in their definitions).
3. The program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version.
4. Provide an option `legacy_format=false` in the serialization of `paddle::framework::ProgramDesc`; it decides whether to convert the ProgramDesc back to a legacy format that paddle 2.4.2 or earlier versions can load and execute.
5. Deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in the legacy format (contain any of the operators that have been incompatibly updated and have any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead. Though it is not recommended, it can be used for testing.
-
Committed by ronnywang
-
- 13 Apr, 2023 1 commit
-
-
Committed by TaoTao Li
* add auto parallel tuner options in launch * add ut for launch in auto_parallel tuner fix code format * fix ci-coverage
-
- 12 Apr, 2023 4 commits
-
-
Committed by ShenLiang
-
Committed by Yulong Ao
* [Auto Parallel] Speedup the completion process * [Auto Parallel] Skip the property of dist_context when deepcopying * [Auto Parallel] Remove the unnecessary print * [Auto Parallel] Move some changes from 2.4 branch to develop * Update engine.py * [Auto Parallel] Fix a bug
-
Committed by 张春乔
* remove c_comm_init_hccl_op.cc and c_gen_hccl_id_op.cc * remove gen_hccl_id_op.cc
-
Committed by CHANGer
-