- 22 Aug, 2023 4 commits
-
-
Committed by PommesPeter
* fix: updated code examples. * fix: added paddle.seed * fix: updated code style * Apply suggestions from code review * refactor: refine detail of code examples * Update python/paddle/distributed/auto_parallel/static/process_mesh_v2.py * fix: refine detail * fix: refine detail * Update python/paddle/distributed/auto_parallel/static/process_mesh_v2.py Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * refactor: refine detail * refactor: refine detail * fix: refine doc --------- Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
-
Committed by 张春乔
-
Committed by lzydev
* optimize the memory * fix bug in static_build.cc * fix bug when using logging * change the static_build * fix bug in windows * fix code according to review
-
Committed by caozhou
-
- 21 Aug, 2023 2 commits
-
-
Committed by Ghost Screaming
-
Committed by RichardWooSJTU
-
- 19 Aug, 2023 1 commit
-
-
Committed by Yuang Liu
-
- 18 Aug, 2023 1 commit
-
-
Committed by Leo Chen
* remove empty block program * update implementation
-
- 17 Aug, 2023 1 commit
-
-
Committed by Kai Song
* [Custom Device] add run_check support for custom device * fix error msg * fix typo * update for all custom device * fix * add warning msg
-
- 16 Aug, 2023 2 commits
-
-
Committed by Ghost Screaming
* [WIP] Add mp_all_reduce asynchronous overlap. * Fix some problems. * Fix dw compute bug, and use a temporary solution to achieve overlap. * Use fused_linear_param_grad_add to compute dw. * Reformat ColumnParallel _overlap_linear. Use environment flags to control the following behaviors: 1. export Flags_mp_aysnc_allreduce=True to turn on mp async all_reduce. 2. export Flags_skip_mp_c_identity=True to skip two c_identity operators in dygraph mode. 3. export Flags_fused_linear_param_grad_add to enable fused_linear_param_grad_add in ColumnParallel backward with mp async all_reduce. * Polish code. * Remove useless communication API. * Fix some problems in mp_async_all_reduce and skip_c_identity. * Add test cases. * Remove environment variable Flags_fused_linear_param_grad_add in test case. * Reset error threshold. * Reset threshold in test case. * Add useful log. Remove useless test cases.
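The three environment flags named in this commit message could be read along the following lines. This is only a minimal sketch: the helper names `env_flag_enabled` and `read_mp_flags` are hypothetical, and treating "true" or "1" as enabled is an assumption, not something the commit message states. The flag names are copied verbatim from the message.

```python
import os

# Flag names copied verbatim from the commit message.
MP_FLAGS = (
    "Flags_mp_aysnc_allreduce",
    "Flags_skip_mp_c_identity",
    "Flags_fused_linear_param_grad_add",
)


def env_flag_enabled(name: str) -> bool:
    """Hypothetical helper: "true" or "1" (case-insensitive) means enabled.

    This parsing rule is an assumption made for illustration only.
    """
    return os.environ.get(name, "").strip().lower() in ("true", "1")


def read_mp_flags() -> dict:
    """Return a mapping from each flag name to whether it is enabled."""
    return {name: env_flag_enabled(name) for name in MP_FLAGS}


if __name__ == "__main__":
    for flag, enabled in read_mp_flags().items():
        print(f"{flag}: {'on' if enabled else 'off'}")
```

An unset flag reads as disabled here, which matches the opt-in wording of the commit message ("to turn on", "to enable").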
-
Committed by zhaoyingli
* make params_grads order same between dynamic and static mode * revert inplace clip * use sorted attribute to control * tiny fix * fix find loss_grad_op
-
- 15 Aug, 2023 1 commit
-
-
Committed by lzydev
* Improve GC for pipeline parallel * Delete print * fix bug of nop_op and sharding --------- Co-authored-by: chenruibiao <chenruibiao@baidu.com>
-
- 14 Aug, 2023 5 commits
-
-
Committed by 张春乔
* input.py * Update python/paddle/nn/functional/input.py * Update input.py * Update all_gather.py * Update all_gather.py
-
Committed by 张春乔
-
Committed by Azure
* temp commit * distribute best cfg * update metric extracting * fix bugs of prune and reading log * fix adding cfg bug * reset status * remove alarm and set logdir * deepcopy ctx * change alarm * fix restart bug * best no need alarm * add gbs search, add gpu memory to history csv, add memory detect * fix bug * fix memory read bug; fix etcd connection bug * fix memory read bug, add oom detection for all ranks * fix read log and oom detection, add error code for read log * add unit test * Update master.py --------- Co-authored-by: caozhou <caozhou@radi.ac.cn>
-
Committed by 张春乔
-
Committed by 张春乔
-
- 11 Aug, 2023 4 commits
-
-
Committed by LoneRanger
* remove the optimizer base and learning rate base * fix bug * fix bug
-
Committed by kangguangli
-
Committed by Difer
replace fluid.io.load_inference_model, fluid.io.save_inference_model in fluid with 2.0 version (#55345) * replace fluid.io.load_inference_model * replace fluid.io.save_inference_model * fix some bug * fix some bugs of load & save model * fix some bug * fix test_inference_model_io bug * fix word2vec_inference_model bug * fix some bug * fix valueError bug * fix some bug * fix a warning error * for debug * for debug * fix io error * fix test_wordvec_book error * remove debug print * fix load_var bug * for debug cinn test * revert cinn & fix inference_pass_test in windows * fix some bugs * revert cinn & fix inference_pass_test in windows * for debug vars * for debug * fix quant_dequant_test * fix some path errors * remove fluid save/load * fix incubate-fleet save * move some from fluid.io to static.io
-
Committed by Difer
* move fluid apis * fix type error * remove static exponential_decay * fix some import error * remove nn.py * fix some error * fix type error
-
- 09 Aug, 2023 2 commits
-
-
Committed by LoneRanger
remove the AdamOptimizer, SGDOptimizer, MomentumOptimizer, ModelAverage, LookaheadOptimizer, FtrlOptimizer, DecayedAdagradOptimizer, DpsgdOptimizer in fluid and relocate the ExponentialMovingAverage, PipelineOptimizer, GradientMergeOptimizer and change optimizer base for LarsMomentumOptimizer and RecomputeOptimizer (#55970) * change the optimizer base for SGDOptimizer * change the optimizer base for SGDOptimizer * replace the SGDOptimizer with SGD * fix bug of sgd * change the optimizer base for MomentumOptimizer * fix the remaining tests * remove the Momentum in fluid/optimizer.py * fix bug * fix bug * fix bug * fix bug * Update test_resnet_cinn.py * Update test_resnet_prim_cinn.py * fix bug * fix bug * fix bug * remove the ModelAverage in fluid * remove the LookaheadOptimizer in fluid * fix bug * remove AdamOptimizer in fluid * Update test_image_classification_fp16.py * fix bug * relocate the ExponentialMovingAverage in fluid * restore the static api * remove the FtrlOptimizer in fluid * remove the DecayedAdagradOptimizer in fluid * remove the DpsgdOptimizer in fluid * fix bug * fix codestyle * fix bug * fix bug * relocate the PipelineOptimizer * relocate the GradientMergeOptimizer * fix bug * fix bug * fix bug * fix doc * Update __init__.py * Update test_fleet_qat_meta_optimizer.py * change optimizer base for LarsMomentumOptimizer * fix bug * fix conflict * fix code-style * fix sample codes * fix bug * fix bug * fix cinn bug * fix bug * fix bug * Update qat_optimizer.py * Update __init__.py * fix bug * change optimizer base for RecomputeOptimizer * fix bug * fix bug * Update test_imperative_optimizer_v2.py
-
Committed by Yuang Liu
-
- 08 Aug, 2023 3 commits
-
-
Committed by Ruibiao Chen
* Improve GC for pipeline parallel * Delete print
-
Committed by Sonder
* open * update
-
Committed by Yuang Liu
-
- 07 Aug, 2023 1 commit
-
-
Committed by LiYuRio
* make tcp store a global instance * fix windows compile error
-
- 02 Aug, 2023 1 commit
-
-
Committed by zhaoyingli
* Update autoparallel DistributedDataLoader * add places for engine.dataloader()
-
- 01 Aug, 2023 3 commits
-
-
Committed by Yuang Liu
-
Committed by LiYuRio
* use string as key for comm_context_manager * remove device_id from comm_context
-
Committed by pangengzheng
-
- 31 Jul, 2023 1 commit
-
-
Committed by Difer
* simple replace * for debug * fix bugs * fix some bugs * del fill_constant_batch_size_like
-
- 27 Jul, 2023 1 commit
-
-
Committed by sneaxiy
-
- 25 Jul, 2023 1 commit
-
-
Committed by Yiqun Liu
* Call multiply_ instead of scale_ to avoid multiple DtoH copy. * Call _squared_l2_norm to calculate grad_clip. * Fix import error.
-
- 24 Jul, 2023 4 commits
-
-
Committed by jjyaoao
Signed-off-by: jjyaoao <jjyaoao@126.com>
-
Committed by Windfarer
-
Committed by Yuang Liu
-
Committed by Chen Weihang
* add shard tensor api * add DistAttr api * add unittest for coverage * fix process mesh sample code * fix checking error
-
- 22 Jul, 2023 2 commits
-
-
Committed by zhenhailiu
-
Committed by sneaxiy
* fix new launch * fix ps uit
-