1. 27 Jun, 2022 1 commit
  2. 25 Jun, 2022 1 commit
  3. 24 Jun, 2022 1 commit
    • [cherry-pick] NVIDIA fixes (#43780) · 9edbe4aa
      Committed by Aganlengzi
      * Use all sitepackages path as the library/include path (#42940)
      
      * Fix several unit tests and increase the unit tests stability (#43670)
      
      * Reduce gather op unit tests size and increase the timeout
      
      * Add NVIDIA_TF32_OVERRIDE for multi-processes environment
      
      * Remove record test for device event ut
      
      * Fix 3 unittest errors (#43532)
      
      * Fix test_fuse_resnet_unit failure
      
      * Fix test_imperative_auto_mixed_precision failure
      
      * Fix sparse_attention_op error
      
      * Fix sparse_attention_op error
      
      * Use fixed random seed (#43659)
      
      * for CI test_collective_sendrecv_api
      Co-authored-by: zlsh80826 <rewang@nvidia.com>
      Co-authored-by: Shijie <505749828@qq.com>
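
The "Add NVIDIA_TF32_OVERRIDE for multi-processes environment" item above can be illustrated with a minimal sketch. It assumes the goal is for every worker process of a multi-process test to see the same `NVIDIA_TF32_OVERRIDE` value (NVIDIA documents `0` as disabling TF32 in its libraries). The helper name `launch_workers` and the echo-only child are hypothetical and are not taken from the commit itself.

```python
import os
import subprocess
import sys


def launch_workers(num_workers, tf32_override="0"):
    """Start child processes that all receive the same NVIDIA_TF32_OVERRIDE.

    Hypothetical helper for illustration; the real unit tests may pass the
    variable to their workers in a different way.
    """
    # Copy the parent environment and set the variable before any child starts,
    # because it has to be in place when the GPU libraries are loaded.
    env = dict(os.environ)
    env["NVIDIA_TF32_OVERRIDE"] = tf32_override

    # Each child only echoes the variable back; a real test would run GPU code.
    child_code = "import os; print(os.environ['NVIDIA_TF32_OVERRIDE'])"
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", child_code],
            env=env,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(num_workers)
    ]
    # Collect what every worker actually observed.
    return [proc.communicate()[0].strip() for proc in procs]


if __name__ == "__main__":
    print(launch_workers(2))
```

Passing the environment explicitly to each child, rather than relying on whatever the parent shell exported, keeps the setting identical across all workers.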
  4. 23 Jun, 2022 4 commits
  5. 22 Jun, 2022 4 commits
    • Cherry pick 43307 (#43618) · d0bbf46c
      Committed by ccrrong
      * add bilinear_interp_v2 converter
      
      * update op_teller.cc
      
      * add unittest for bilinear_interp_v2 converter
      
      * code format
      
      * bug fix
      
      * code format and add unittest
      
      * remove merged modify in op_teller.cc
      
      * code format
      
      * code format
      
      * fix scale init error
    • Optimize linspace to avoid GPU -> CPU copy. (#42750) (#43746) · 4dcfc6df
      Committed by Yiqun Liu
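      Cherry-pick #42750.
      
      QA reports that the optimization in #42750 can improve solov2 model performance by 6%, so it is cherry-picked to 2.3. Because #41096 moved the Python implementation of linspace from fluid.layers.tensor to paddle.tensor.creation, and that PR is not in the release/2.3 branch, the Python changes from #42750 are applied to fluid.layers.tensor.linspace instead.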
    • [cherry pick] Support optional residual add in fused ops and slice large... · 0660d5f2
      Committed by Zhang Ting
      [cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719)
      
       [cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax
      
      cherry-pick #43635 #43681 #43474
    • test=document_fix;cherry pick code format check upgrade to release/2.3 (#43732) · 8e6a1945
      Committed by Sing_chan
          Only the format tool upgrades (clang-format, yapf, cmake-format) are cherry-picked to release/2.3; lint tools such as cpplint are not moved, because cpplint errors will not be fixed in release/2.3.
          pre_commit.sh is also moved to release/2.3 so that both PR-CI-pre-commit and PR-CI-pre-commit-23 work.
          Pre-install clang-format to avoid repeated installation caused by pre-commit's multi-threaded runs.
  6. 21 Jun, 2022 4 commits
  7. 20 Jun, 2022 1 commit
  8. 17 Jun, 2022 2 commits
  9. 15 Jun, 2022 1 commit
  10. 14 Jun, 2022 1 commit
    • [ CherryPick ] Cherry pick for einsum optimization. (#43468) · 22e75d92
      Committed by xiongkun
      * [EinsumOp] Polish forward logic and backward logic for optimize (#42603)
      
      * change logic for optimize
      
      * modify
      
      * merge
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010)
      
      * [EinsumOp] Make EinsumOp support bfloat16. (#43085)
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0
      
      * make EinsumOp support bf16
      
      * add unittest for BF16
      
      * add condition for test_BF16
      
      * fix bugs
      
      * fix
      
      * change the backward api to fit einsum op
  11. 09 Jun, 2022 1 commit
  12. 08 Jun, 2022 2 commits
  13. 06 Jun, 2022 1 commit
    • cherry-pick 42645 (#43205) · 835a1888
      Committed by niuliling123
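      Remove the rank instantiation and the Elementwise call from the Broadcast function to reduce compile time.
      Adapted from PR #42645 on the develop branch. Because the develop and release branches have diverged too far for a cherry-pick to apply, a new PR was submitted against release2.3.
      Instantiating Broadcast on rank causes heavy expansion of the underlying templates and makes reduce_sum_grad_kernel.cu.o too large; this change reduces both the .o size and the compile time.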
  14. 30 May, 2022 2 commits
  15. 26 May, 2022 1 commit
  16. 23 May, 2022 1 commit
  17. 17 May, 2022 1 commit
  18. 11 May, 2022 1 commit
  19. 10 May, 2022 4 commits
  20. 09 May, 2022 1 commit
  21. 07 May, 2022 2 commits
  22. 05 May, 2022 1 commit
  23. 04 May, 2022 2 commits
    • graph partition (#42472) · a3917625
      Committed by seemingwang
      * enable graph-engine to return all id (#42319)
      
      * enable graph-engine to return all id
      
      * change vector's dimension
      
      * change vector's dimension
      
      * enlarge returned ids dimensions
      
      * change sample result's structure to fit training (#42426)
      
      * enable graph-engine to return all id
      
      * change vector's dimension
      
      * change vector's dimension
      
      * enlarge returned ids dimensions
      
      * add actual_val
      
      * change vlog
      
      * fix bug
      
      * bug fix
      
      * bug fix
      
      * fix display test
      
      * singleton of gpu_graph_wrapper
      
      * change sample result's structure to fit training
      
      * recover sample code
      
      * fix
      
      * secondary sample
      
      * add graph partition
      
      * fix pybind
      Co-authored-by: DesmonDay <908660116@qq.com>
    • fix paddle-ort python bug (#42464) (#42470) · 87e6149c
      Committed by heliqi
      * fix paddle-ort python bug
      
      * fix paddle-ort python bug