1. 02 Feb, 2023 1 commit
  2. 19 Jan, 2023 1 commit
    • [cherry-pick] Fix paddle.squeeze_ bug (#49937) · 34fafb11
      heliqi committed
      * Fix paddle.squeeze_ bug (#49903)
      
      * fix squeeze_ bug
      
      * fix solve use of squeeze_kernel
      
      * fix solve use of squeeze_kernel
      
      * fix solve use of squeeze_kernel
      
      * add test case
      
      * Update squeeze_kernel.h
  3. 04 Jan, 2023 1 commit
  4. 03 Jan, 2023 2 commits
  5. 30 Dec, 2022 1 commit
  6. 29 Dec, 2022 1 commit
  7. 28 Dec, 2022 1 commit
  8. 27 Dec, 2022 1 commit
    • [Cherry-pick] Fix custom operator backward=None (#48656) (#48715) · 39eb77a6
      HongyuJia committed
      * [Release2.4] Revert python link prs (#48573)
      
      * Revert "Fix mac link python (#48017)"
      
      This reverts commit 3fa7a736.
      
      * Revert "[Cherry-pick] Fix python link error (#47811)"
      
      This reverts commit ff642c68.
      
      * Update config.go
      
      * fix custom operator backward=None (#48656)
      
      * [Custom Extension] Fix custom double_grad backward=None (#49224)
      
      * fix custom double_grad backward=None
      
      * fix custom_relu.cu bug && polish testcase of double_grad
      
      * remove old dynamic graph test
      
      * add import fluid
      
      * add import fluid
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
  9. 22 Dec, 2022 1 commit
  10. 21 Dec, 2022 2 commits
  11. 28 Nov, 2022 1 commit
    • Cherrypick NV fixes to release/2.4 (#48263) · 7a0b8625
      zlsh80826 committed
      * Reduce squeeze2_matmul_fuse_pass and flatten test time (#47098)
      
      * Add missing fp32 config and reduce the testing combination
      
      * Reduce trt matmul pass test max examples
      
      * Loosen TRT fp16 tests tolerance (#47100)
      
      * Loosen TRT half test tolerance to 1e-3 (#47101)
      
      * Loosen TRT half test tolerance to 1e-3 (#47106)
      
      * Update distributed_strategy.proto (#46531)
      
      * Close popen pipe after use (#47053)
      
      * Add launch_bounds (#47285)
      
      * Fix TRT UT failures (#47488)
      
      * Format cherry-picked commits
      
      * CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)
      
      * Skip tests that use fused_ops on H100
      
      * Add error message to FusedOps on H100
      Co-authored-by: Shijie <505749828@qq.com>
      Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
      Co-authored-by: Tian Zheng <tizheng@nvidia.com>
  12. 24 Nov, 2022 1 commit
  13. 07 Nov, 2022 2 commits
  14. 04 Nov, 2022 2 commits
    • [CherryPick] Cherry pick #45916 #46031 #47299 (#47610) · 72e1eb6b
      xiongkun committed
      * [Dy2Static] Fix bugs when select_input meets inputs of different shapes or an undefined var (#45916)
      
      * fix select_input errors with different shapes:
      1. select_input_with_buildin_type directly returns the non-undefined-var branch when it meets an undefined var
      2. the output shape of select_input is inferred from inputs.
      
      * reverse the logic in select_input
      
      * [warning] added warning message in cond block when one branch returns a variable and another returns None (#46031)
      
      * [cherry-pick] Allow manually setting py_reader name in standalone executor (#45898) (#45931)
      
      * Allow manually setting py_reader name in standalone executor
      
      * [BugFix] while cond receives dict as input (#47299)
      
      * fix bugs while cond receives dict as input
      
      * add unittest
      
      * change flatten -> _is_sequence_except_dict
      
      * code format
      Co-authored-by: feifei-111 <wuzhanfei@baidu.com>
    • [cherry-pick2.4] for CodeStyle (#47608) · cfee9c13
      Ligoml committed
      * only run pre-commit
      
      * only run pre-commit
  15. 03 Nov, 2022 2 commits
  16. 31 Oct, 2022 2 commits
  17. 28 Oct, 2022 1 commit
  18. 27 Oct, 2022 2 commits
  19. 26 Oct, 2022 1 commit
  20. 25 Oct, 2022 1 commit
  21. 20 Oct, 2022 8 commits
  22. 19 Oct, 2022 4 commits
    • [Cherry-Pick][AutoParallel] auto_parallel cherry-pick to release2.4 (#47145) · 90b31790
      zhaoyingli committed
      * [Auto Parallel] Make Engine class callable (#46416)
      
      * [Auto Parallel] Improve the user-defined fetches and logging
      
      * [Auto Parallel] Make Engine class callable
      
      * [Auto Parallel] Update the data loading of tuner
      
      * Print IPS in auto parallel Engine (#46554)
      
      * [AutoParallel] fix dist_split (#46505)
      
      * [AutoParallel] fix dist_split
      
      * add unittest
      
      * update cmakelist
      
      * [AutoParallel] fix sharding (#46572)
      
      * [AutoParallel] fix process_mesh (#46583)
      
      * [AutoParallel] fix reshard when train with eval (#46605)
      
      * [AutoParallel] fix reshard when train with eval
      
      * fix mppp
      
      * [AutoParallel] fix amp when predict (#46637)
      
      * [Auto Parallel] Update comp cost and completion for gpt auto search (#46387)
      
      * update comp cost and completion for gpt auto search
      
      * add unittest
      
      * [Auto Parallel] Fix bugs caused by the inconsistent outputs of Engine API (#46633)
      
      * [Auto Parallel] Unify the logger and outputs of Engine API
      
      * [Auto Parallel] Fix the bugs of to_static
      
      * [Auto Parallel] Adjust the test_to_static.py
      
      * [Auto Parallel] Improve the fine-grained APIs (#46552)
      
      * [Auto Parallel] Support different dataloaders
      
      * [Auto Parallel] Add num_shards config for dataset
      
      * [Auto Parallel] Unify the logger and outputs of Engine API
      
      * [Auto Parallel] Fix the bugs of to_static
      
      * [Auto Parallel] Adjust the test_to_static.py
      
      * [Auto Parallel] Add the prepare API and replace __call__ with run
      
      * [Auto Parallel] Improve the private implementations of Engine
      
      * [Auto Parallel] Set capacity of dataloader for opt tuning
      
      * [Auto Parallel] [WIP] Change the fine-grained API
      
      * [Auto Parallel] Improve APIs to support different use cases
      
      * [Auto Parallel] Add removed config
      
      * [Auto Parallel] Add imports
      
      * [Auto Parallel] Fix bugs for to_static
      
      * [Auto Parallel] Remove unnecessary imports
      
      * bugfix (#46921)
      
      * [Auto Parallel] Fix the bug for None labels (#46987)
      
      * [AutoParallel] adapt for gpt-gen (#46771)
      
      * for gpt-gen
      
      * fix reshard
      
      * adapt assign and shape op
      
      * add dist_assign & unittest
      
      * add conditional block unittest
      
      * rename unittest
      
      * [Auto Parallel] Fix the bug of completion (#47056)
      
      * [Auto Parallel] Fix the bug for None labels
      
      * [Auto Parallel] Fix the completion bug
      
      * [AutoParallel] add callbacks (#47014)
      
      * [AutoParallel] add callbacks
      
      * fix unittest
      
      * fix dist_context
      
      * fix engine
      
      * fix cmakelist
      
      * fix unittest's returns
      
      * fix cmakelist
      
      * [Auto Parallel] Add cost interface (#47043)
      
      * add cost interface
      
      * update interface and add unittest
      
      * update unittest
      
      * update interface
      
      * [Auto Parallel] Add parallel tuner (#46189)
      
      * add parallel tuner
      
      * add unittest
      
      * fix unittest
      
      * set timeout of unittest
      
      * set unittest timeout
      
      * fix auto_mode setting
      
      * update unittest
      
      * sync from develop and update unittest
      
      * remove unused import
      
      * update unittest
      
      * update cmakelist
      
      * add unittests
      Co-authored-by: Yulong Ao <aoyulong@baidu.com>
      Co-authored-by: Ruibiao Chen <chenruibiao@baidu.com>
      Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
      Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
    • [Dy2Stat] Polish @to_static temporary file directory to speed up transformation (#47102) (#47144) · 5a9befea
      Aurelius84 committed
      Polish @to_static temporary file directory to speed up transformation
    • [CherryPick] Support TypeHint for function decorated by @to_static (#47147) · 247ef477
      xiongkun committed
      * [Dy2Static] Support TypeHint for function decorated by @to_static (#47121)
      
      * Add TypeHint Transformer
      
      * add unittest for typehint transformer
      
      * [Dy2Static] Remove GradTransformer (#47063)
      
      * [Dy2Static] Remove GradTransformer
      1. fix einsum infershape bugs.
      2. remove grad_transformer and unify paddle.grad and paddle.static.gradient.
      3. add dygraph_and_dy2static_only decorator for dy2static.
      
      * fix bugs
      
      * rename
    • [Dy2St] Fix recurrent op eager deletion pass error in dy2st (#47105) (#47134) · 69515e90
      WangZhen committed
      [CherryPick][Dy2St] Fix recurrent op eager deletion pass error in dy2st
  23. 18 Oct, 2022 1 commit