1. 07 September 2023 (1 commit)
    • [NewIR] Update send recv infermeta and add unittest (#56794) · 2857fdbb
      Committed by zhaoyingli
      * [NewIR]Update send recv infermeta and add unittest
      
      * rm new ir flag
      
      * rm fluid api
      
      * skip running startup prog
      
      * update flag name
      
      * update recv_v2 yaml
      
      * fix conflict
      
      * unittest only for pp
      
      * fix cmakelist
      
      * unittest check precision
      
      * control random
      
      * fix cmakelist
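The "unittest check precision" and "control random" bullets rely on a standard testing idea: fix all random seeds so two runs produce bit-wise comparable results. A minimal stdlib sketch of that idea (not Paddle's actual test code):

```python
import random


def run_trial(seed):
    # Fix the seed so repeated runs produce identical "random" values,
    # which makes precision comparisons between runs meaningful.
    random.seed(seed)
    return [random.random() for _ in range(3)]


# Two trials with the same seed yield identical sequences.
a = run_trial(1234)
b = run_trial(1234)
assert a == b
```

In a real distributed test, the same treatment would also cover the framework's own seeds, not just Python's `random` module.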
  2. 06 September 2023 (2 commits)
  3. 05 September 2023 (4 commits)
    • Add attributes to support to analyse the stream across interpreters (#56814) · f5497fd0
      Committed by lzydev
      * fix static_build for pp
      
      * add manual_event to support streams across progs
      
      * revert static_build.sh
      
      * fix coverage-ci
      
      * modify the method to name events
      
      * change code according to review
    • fix some bugs for amp and test case test_tuning_recompute_with_amp.py (#56864) · e9e07a19
      Committed by Wennie396
      * replace amp.use_pure_fp16 with amp.dtype and amp.level
      
      * old api still use use_pure_fp16
      
      * test_fuse_adamw_pass still use use_pure_fp16
      
      * add test case for tuning recompute with amp (float16, O2)
      
      * reset new test case properties TIMEOUT 60
      
      * set smaller value of batch_size and batch_num
      
      * deepcopy dist_context fix _rename_input problem
      
      * fix loss name after cast
      
      * set tuning.enable=True and use engine._tune()
      
      * restore some changes in _rename_input()/_rename_output()
      
      * add self.amp_dtype for _cast_loss() in auto_parallel_amp.py
      
      * fix insert op index in _cast_loss()
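The first bullet replaces a single boolean switch with two explicit fields. The mapping can be sketched as below; the function and field names are illustrative, not Paddle's actual API:

```python
def translate_amp_config(use_pure_fp16: bool) -> dict:
    """Map the legacy use_pure_fp16 flag onto separate dtype/level fields."""
    # Historically, use_pure_fp16=True meant pure-fp16 (O2) training,
    # while False meant mixed-precision (O1).
    level = "o2" if use_pure_fp16 else "o1"
    return {"dtype": "float16", "level": level}


print(translate_amp_config(True))   # {'dtype': 'float16', 'level': 'o2'}
```

Splitting the flag this way lets dtype (float16 vs bfloat16) vary independently of the optimization level, which a single boolean cannot express.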
    • [xdoctest][task 184-185] reformat example code with google style in `distributed/auto_parallel/static/*` (#56666) · 1a15a351
      Committed by 小飞猪
      
      * [Doctest]fix No.184,185, test=docs_preview
      
      * add env skip
      
      * fix @staticmethod
      
      * fix
      
      * add xdoctest for v2
      
      * fix
    • [xdoctest][task 224-225] reformat example code with google style in `python/paddle/distributed/fleet` (#56815) · 53d0869f
      Committed by iSerendipity
      
      * [Doctest]fix No.224-225, test=docs_preview
      
      * fix the AttributeError
  4. 04 September 2023 (1 commit)
  5. 01 September 2023 (2 commits)
  6. 31 August 2023 (5 commits)
  7. 30 August 2023 (2 commits)
    • [Auto Parallel] Compatible new comm library upgrade (#56604) · ade51aa5
      Committed by Ghost Screaming
      * for verify: fluid operators support new comm library
      
      * u
      
      * u
      
      * u
      
      * compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter.
      
      * Remove useless comments in process_group.py
      
      * Polish code style.
      
      * Fix some problems.
      
      * Remove use fluid api in phi comm_context_manager.
      
      * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro judgements.
      
      * Fix bug of HIP architecture.
      
      * Fix some problems.
      1. Remove useless loggings.
      2. Fix conditional compilation for HIP.
      3. Fix problems of test_pass_generation_pipeline.py. It calls paddle.distributed.init_parallel_env() first,
      then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env()
      calls paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding
      NCCLCommContext. But dev_id is not set; as a result, the NCCLCommContext's dev_ctx is not initialized.
      
      * Fix some problems.
      
      * Polish code.
      
      * Polish code.
      
      * Revert compatible upgrade for communication operators. Their upgrades
      will be submitted in another PR.
      
      * Remove StaticTCPStore.
      
      * Remove useless modification.
      
      * Remove useless set_cuda_device_id.
      
      * Polish code.
      
      * Remove fluid header files in phi files.
      
      * Remove useless comments.
      
      * Fix problems of hip arch.
      
      * Fix some problems.
      
      * Polish code.
      
      * Polish code style.
      
      ---------
      Co-authored-by: hitywt <yuwentao126@126.com>
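The ordering bug described in the third fix above can be modeled in a few lines. All class and method names below are hypothetical stand-ins, not Paddle's real ones; the point is only that creating a comm context before the device id is set leaves its device context uninitialized:

```python
class FakeCommContextManager:
    """Toy model of the init-order bug: dev_id must be set before contexts exist."""

    def __init__(self):
        self.dev_id = None

    def set_device_id(self, dev_id):
        self.dev_id = dev_id

    def create_comm_context(self):
        if self.dev_id is None:
            # Mirrors the reported symptom: the context's dev_ctx
            # would be created against an unknown device.
            raise RuntimeError("dev_id not set before creating comm context")
        return {"dev_ctx": f"gpu:{self.dev_id}"}


mgr = FakeCommContextManager()
mgr.set_device_id(0)             # the fix: establish the device id first
ctx = mgr.create_comm_context()
print(ctx["dev_ctx"])            # gpu:0
```

Calling `create_comm_context()` before `set_device_id()` raises, which is the toy analogue of `barrier()` creating an NCCLCommContext before `dev_id` is known.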
    • [xdoctest] reformat example code with google style in No.307 (#56595) · 34eecb0e
      Committed by 张春乔
      * weight_norm_hook
      
      * Update weight_norm_hook.py
      
      * Update weight_norm_hook.py
      
      * Update python/paddle/nn/utils/weight_norm_hook.py
      
      * Update python/paddle/nn/utils/weight_norm_hook.py
      
      * Update python/paddle/nn/utils/weight_norm_hook.py
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * xdoc
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      ---------
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
  8. 29 August 2023 (2 commits)
  9. 28 August 2023 (2 commits)
  10. 25 August 2023 (4 commits)
  11. 24 August 2023 (1 commit)
  12. 23 August 2023 (1 commit)
  13. 22 August 2023 (5 commits)
  14. 21 August 2023 (2 commits)
  15. 19 August 2023 (1 commit)
  16. 18 August 2023 (1 commit)
  17. 17 August 2023 (1 commit)
  18. 16 August 2023 (2 commits)
    • Add mp_all_reduce asynchronize overlap. (#55662) · 6b1dfb5f
      Committed by Ghost Screaming
      * [WIP] Add mp_all_reduce asynchronize overlap.
      
      * Fix some problems.
      
      * Fix dw compute bug, and use a temporary solution to achieve overlap.
      
      * Use fused_linear_param_grad_add to compute dw.
      
      * Reformat ColumnParallel _overlap_linear. Use environment flags to
      control the following behaviors:
      1. export Flags_mp_aysnc_allreduce=True to turn on mp async all_reduce
      2. export Flags_skip_mp_c_identity=True to skip two c_identity operators
         in dygraph mode.
      3. export Flags_fused_linear_param_grad_add to enable fused_linear_param_grad_add
         in ColumnParallel backward with mp async all_reduce.
      
      * Polish code.
      
      * Remove useless communication API.
      
      * Fix some problems in mp_async_all_reduce and skip_c_identity.
      
      * Add test cases.
      
      * Remove environment variable Flags_fused_linear_param_grad_add in test case.
      
      * Reset error threshold.
      
      * Reset threshold in test case.
      
      * Add useful log. Remove useless test cases.
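The three flags in the "Reformat ColumnParallel _overlap_linear" bullet would be set before launching training, roughly as below. The flag spellings are copied verbatim from the commit message; whether the third flag expects `True` (rather than, say, `1`) is an assumption:

```shell
# Hypothetical launch snippet; flag names are verbatim from the commit text.
export Flags_mp_aysnc_allreduce=True           # overlap mp all_reduce with backward compute
export Flags_skip_mp_c_identity=True           # skip the two c_identity ops in dygraph mode
export Flags_fused_linear_param_grad_add=True  # compute dw via fused_linear_param_grad_add
echo "overlap flags: $Flags_mp_aysnc_allreduce $Flags_skip_mp_c_identity $Flags_fused_linear_param_grad_add"
```

Per the commit, the third flag only takes effect together with mp async all_reduce, since the fused dw computation is part of that overlapped backward path.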
    • make params_grads order same between dynamic and auto_parallel (#56126) · 496422e9
      Committed by zhaoyingli
      * make params_grads order same between dynamic and static mode
      
      * revert inplace clip
      
      * use sorted attribute to control
      
      * tiny fix
      
      * fix find loss_grad_op
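The "use sorted attribute to control" bullet suggests imposing one deterministic ordering on the (param, grad) list so that dynamic and static modes agree. A minimal stdlib sketch of that idea, with made-up parameter names:

```python
# Hypothetical (name, grad) pairs; only the ordering idea matters here.
params_grads = [
    ("linear_1.w_0", "g1"),
    ("embedding_0.w_0", "g2"),
    ("linear_0.w_0", "g0"),
]

# Sorting by parameter name yields the same order regardless of which
# execution mode produced the list, so optimizer updates line up.
ordered = sorted(params_grads, key=lambda pg: pg[0])
print([name for name, _ in ordered])
# ['embedding_0.w_0', 'linear_0.w_0', 'linear_1.w_0']
```

A stable, mode-independent order matters for bitwise reproducibility: gradient clipping and optimizer updates applied in a different order can accumulate different floating-point rounding.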
  19. 15 August 2023 (1 commit)