1. 05 Sep, 2023 1 commit
    • fix some bugs for amp and test case test_tuning_recompute_with_amp.py (#56864) · e9e07a19
      Wennie396 committed
      * replace amp.use_pure_fp16 with amp.dtype and amp.level
      
      * old api still use use_pure_fp16
      
      * test_fuse_adamw_pass still use use_pure_fp16
      
      * add test case tuning recompute with amp(float16,o2)
      
      * reset new test case properties TIMEOUT 60
      
      * set smaller value of batch_size and batch_num
      
      * deepcopy dist_context fix _rename_input problem
      
      * fix loss name after cast
      
      * set tuning.enable=True and use engine._tune()
      
      * restore some changes in _rename_input()/_rename_output()
      
      * add self.amp_dtype for _cast_loss() in auto_parallel_amp.py
      
      * fix insert op index in _cast_loss()
  2. 04 Sep, 2023 1 commit
  3. 31 Aug, 2023 2 commits
    • add op cost interface (#56803) · 51ba2a0f
      caozhou committed
    • [AutoParallel] Adapt static spmd rules for dynamic graph (#56367) · 54fcd9a9
      Chen Weihang committed
      * move matmul spmd rules into phi
      
      * add basic infer spmd utils
      
      * add spmd factory
      
      * fix compile error
      
      * add unittest
      
      * refine infer spmd test and utils
      
      * debug infer spmd test
      
      * adapt python test
      
      * polish details
      
      * change to vector attr arg
      
      * revert needless change
      
      * update matmul spmd rule test
      
      * remove original rule
      
      * polish details
      
      * fix macro error
      
      * add comment
      
      * pass backward test
      
      * fix compile error
      
      * add cmake rule for spmd_rules_test
      
      * add dist meta tensor
      
      * update pybind impl
      
      * add macro for rules
  4. 25 Aug, 2023 3 commits
  5. 24 Aug, 2023 1 commit
  6. 22 Aug, 2023 1 commit
    • [AutoParallel] Polish dist tensor design (#56368) · 8495377a
      Chen Weihang committed
      * polish dist tensor design
      
      * adjust constructor
      
      * polish details
      
      * polish details design
      
      * fix compile error
      
      * refactor init tensor impl
      
      * fix reshard test
      
      * polish details
      
      * add unittest for coverage
  7. 16 Aug, 2023 2 commits
    • move test case from cpp to python (#56333) · 163152aa
      LiYuRio committed
    • [AutoParallel] Dygraph basic impl for semi auto parallel (#55698) · 7039bef3
      Chen Weihang committed
      * add phi forward api gen impl
      
      * add phi backward gen code
      
      * polish api code gen impl
      
      * polish code gen impl
      
      * remove auto_parallel namespace
      
      * add dygraph forward impl
      
      * add for_auto_parallel cond
      
      * fix code gen errors
      
      * add dygraph backward impl
      
      * resolve conflict with develop
      
      * refactor dist api gen impl
      
      * revert origin api gen impl
      
      * replace template for override func
      
      * fix dnnl macro error
      
      * revert third_party change
      
      * add with distributed macro
      
      * Update grad_tensor_holder.cc details
      
      * merge dist tensor constructor
      
      * change test tensor to replicate
      
      * fix typo
      
      * resolve conflict with develop
      
      * fix out dim error
  8. 15 Aug, 2023 1 commit
  9. 14 Aug, 2023 1 commit
    • [Semi-Auto] Add reshape spmd rule (#55177) · a97b507e
      Yichen Zhang committed
      * add reshape spmd rule
      
      * add unit test for reshape spmd rule
      
      * bug fix
      
      * replace the print_info function with to_string
      
      * fix typo
      
      * bug fix
      
      * add handling for "0" in target shape
      
      * remove the part of computing size in dim_trans.cc
  10. 10 Aug, 2023 1 commit
  11. 09 Aug, 2023 1 commit
    • remove the... · 723c6f77
      LoneRanger committed
      remove the AdamOptimizer, SGDOptimizer, MomentumOptimizer, ModelAverage, LookaheadOptimizer, FtrlOptimizer, DecayedAdagradOptimizer, DpsgdOptimizer in fluid and relocate the ExponentialMovingAverage, PipelineOptimizer, GradientMergeOptimizer and change optimizer base for LarsMomentumOptimizer and RecomputeOptimizer (#55970)
      
      * change the optimizer base for SGDOptimizer
      
      * change the optimizer base for SGDOptimizer
      
      * replace the SGDOptimizer with SGD
      
      * fix bug of sgd
      
      * change the optimizer base for MomentumOptimizer
      
      * fix the remaining tests
      
      * remove the Momentum in fluid/optimizer.py
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * Update test_resnet_cinn.py
      
      * Update test_resnet_prim_cinn.py
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * remove the ModelAverage in fluid
      
      * remove the LookaheadOptimizer in fluid
      
      * fix bug
      
      * remove AdamOptimizer in fluid
      
      * Update test_image_classification_fp16.py
      
      * fix bug
      
      * relocate the ExponentialMovingAverage in fluid
      
      * restore the static api
      
      * remove the FtrlOptimizer in fluid
      
      * remove the DecayedAdagradOptimizer in fluid
      
      * remove the DpsgdOptimizer in fluid
      
      * fix bug
      
      * fix codestyle
      
      * fix bug
      
      * fix bug
      
      * relocate the PipelineOptimizer
      
      * relocate the GradientMergeOptimizer
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix doc
      
      * Update __init__.py
      
      * Update test_fleet_qat_meta_optimizer.py
      
      * change optimizer base for LarsMomentumOptimizer
      
      * fix bug
      
      * fix conflict
      
      * fix code-style
      
      * fix sample codes
      
      * fix bug
      
      * fix bug
      
      * fix cinn bug
      
      * fix bug
      
      * fix bug
      
      * Update qat_optimizer.py
      
      * Update __init__.py
      
      * fix bug
      
      * change optimizer base for RecomputeOptimizer
      
      * fix bug
      
      * fix bug
      
      * Update test_imperative_optimizer_v2.py
  12. 04 Aug, 2023 2 commits
  13. 02 Aug, 2023 1 commit
  14. 01 Aug, 2023 1 commit
  15. 31 Jul, 2023 1 commit
  16. 24 Jul, 2023 3 commits
  17. 20 Jul, 2023 1 commit
    • [Semi Auto] Entropy SPMD Rule (#55394) · 5f376f00
      JZ-LIANG committed
      * base rule
      
      * add sharding merge
      
      * add sharding axis merge
      
      * define unified data class for inferencing dist_attr
      
      * test wrap DistTensorSpec in dygraph mode
      
      * matmul main logic done
      
      * shape int64
      
      * common cc
      
      * define unified data class for inferencing dist_attr
      
      * test wrap DistTensorSpec in dygraph mode
      
      * define python api and wrap function in static mode for DistTensorSpec
      
      * revise syntax
      
      * map bugfix
      
      * broadcast func
      
      * compile 1
      
      * add unittest
      
      * add registry
      
      * update unittest
      
      * bugfix
      
      * bugfix
      
      * add pybind
      
      * bugfix
      
      * bugfix macro global namespace
      
      * bugfix macro global namespace
      
      * pybind
      
      * pybind test
      
      * pybind bugfixed1
      
      * pybind bugfixed2
      
      * pybind unittest
      
      * merge dev
      
      * merge dev
      
      * merge dev
      
      * fixed cmake conflict
      
      * fixed cmake conflict
      
      * rename get method
      
      * revise inferforward output type
      
      * revise comment
      
      * replicated rule
      
      * replicated rule 2
      
      * revert bug deps
      
      * add rule
      
      * add unittest
      
      * add rule
      
      * add unittest
      
      * move ut of auto_parallel
      
      * fix ut
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * resolute input sharding conflict maybe
      
      * fixed comment
      
      * add rule
      
      * add unittest
      
      * fixed typos
      
      ---------
      Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
  18. 12 Jul, 2023 1 commit
  19. 07 Jul, 2023 3 commits
  20. 06 Jul, 2023 2 commits
  21. 04 Jul, 2023 1 commit
  22. 29 Jun, 2023 1 commit
  23. 27 Jun, 2023 1 commit
  24. 25 Jun, 2023 1 commit
  25. 20 Jun, 2023 1 commit
    • [AutoTuner] Add compare and record (#54668) · 6fe7b5e2
      Azure committed
      * add auto tuner
      
      * compare and record module
      
      * revert launch main
      
      * add prune rule
      
      * add unit test
      
      * add auto tuner
      
      * revert launch main
      
      * add prune rule
      
      * modify unit test script
      
      * fix bug for dump nodes; fix bug for checking log file
      
      * fix bug
      
      ---------
      Co-authored-by: caozhou <caozhou@radi.ac.cn>
  26. 14 Jun, 2023 3 commits
    • [AutoTuner] Add auto tuner to obtain optimal configuration (#54460) · e12d2867
      caozhou committed
      * add auto tuner
      
      * fix prune
      
      * fix sharding prune and mbs candidates
      
      * fix cfg
      
      * fix launch
      
      * fix launch
      
      * add unittest
      
      * fix code style
    • Fix cuda12 timeout problems. (#54615) · a90d9088
      Ghost Screaming committed
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * Remove climits.
      
      * Fix problem of pickle and NCCL_P2P_DISABLE in distributed testcases in
      cuda12.
      
      * Fix problem of TimeOut of distributed testcases under cuda12.
      
      * Remove useless modification.
      
      * Remove useless modification.
    • Fix A100 CUDA12 ut (#54487) · a96c6dc7
      sneaxiy committed
      * fix A100 CUDA12 ut
      
      * fix ci uts
      
      * fix test_sync_batch_norm_op
      
      * fix sync bn op ut again by separating 2 files
      
      * fix codestyle ci
      
      * combine other PRs
      
      * fix codestyle
      
      * fix codestyle ci
  27. 13 Jun, 2023 2 commits