1. 21 Apr, 2022 9 commits
  2. 20 Apr, 2022 11 commits
  3. 19 Apr, 2022 13 commits
    • Z
      [cherry-pick] add rsqrt, equal_all, expand yaml and unittest (#41443, #41540) (#41965) · 018245d8
      zyfncg committed
      * add rsqrt yaml and unittest (#41443)
      
      * Add expand equal all yaml (#41540)
      
      * add expand, poisson
      
      * add poisson grad
      
      * add expand equal_all poisson triangular solve yaml
      Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
      018245d8
    • Z
      [XPUPS]add rename for heter_ps.cu (#41922) (#41968) · 13202ff7
      zmxdream committed
      * add rename for heter_ps.cu
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      13202ff7
    • W
      [Eager] Fix numpy interface for constructing empty tensor (#41904) (#41954) · 551e9140
      Weilong Wu committed
      * [Eager] Fix numpy interface for constructing empty tensor
      
      * Fix CI, construct empty tensor
      
      * Modify empty tensor's shape from [] to [0]
      
      * Add more test for constructing empty tensor
      551e9140
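The commit message does not state why the shape changed from `[]` to `[0]`. One plausible reading, sketched below under that assumption, is the usual array convention in which an empty shape denotes a 0-d scalar holding one element, while `[0]` denotes a 1-d tensor holding none. The helper `numel` is hypothetical and is not a Paddle API.

```python
from math import prod


def numel(shape):
    """Number of elements implied by a shape (hypothetical helper, not a Paddle API)."""
    # The product over an empty sequence is 1, so shape [] describes a scalar.
    return prod(shape)


scalar_like = numel([])   # 0-d shape: one element, so not an empty tensor
empty_1d = numel([0])     # 1-d shape of length zero: no elements
```

Under this convention only the `[0]` shape is consistent with a tensor that holds no data.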
    • Z
      21c333df
    • Y
      [Cherry-pick 2.3] Autotune the workspace and kernel choosing of conv (#41833) · b4adbe5c
      Yiqun Liu committed
      Cherry-pick #40338 #41741 #41313
      b4adbe5c
    • F
      [cherry-pick] XPUPS Adaptation (#41917) · a9d8b947
      Fan Zhang committed
      * XPUPS Adaptation (#40991)
      
      * Adapt XPUPS - 1st version - 3.24
      
      * Adapt XPUPS - update XPU PushSparse -  2nd version - 3.24
      
      * Adapt XPUPS - add XPU PullSparseOp - 3rd version - 3.25
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * Adapt XPUPS - modify by compilation - 4th version - 3.27
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * heter_comm update
      
      * heter_comm update
      
      * update calc_shard_offset. test=develop
      
      * heter_comm update
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30
      
      * update. test=develop
      
      * update pslib.cmake
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 6th version - 3.30
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * used by minxu
      
      * update heter_comm_inl
      
      * fix. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 7th version - 3.30
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 3.31 update
      
      * Adapt XPUPS - update kp compilation path  - 8th version - 3.31
      
      * add optimizer kernel. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm.h 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 9th version - 4.1
      
      * update hashtable. test=develop
      
      * fix. test=develop
      
      * update hashtable 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 10th version - 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1 19:30
      
      * fix. test=develop
      
      * update ps_gpu_wrapper.kps 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 11th version - 4.1
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 12th version - 4.2
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.2
      
      * 4.2 update
      
      * fix. test=develop
      
      * template init. test=develop
      
      * update 4.6
      
      * fix. test=develop
      
      * template init. test=develop
      
      * 4.6 modify by compilation
      
      * hashtable template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 13th version - 4.7
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * update by pre-commit
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.12 update
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 14th version - 4.13
      
      * 4.13 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 modify by merged latest compilation
      
      * retry CI 4.14
      
      * 4.15 pass static check
      
      * 4.15 modify by gpups CI
      
      * 3.16 update by gpups CI - modify ps_gpu_wrapper.h
      
      * 4.16 update
      
      * 4.16 pass xpu compile
      
      * 4.16 retry CI
      
      * 4.16 update
      Co-authored-by: zmxdream <zhangminxu01@baidu.com>
      
      * modify ps_gpu_wrapper.cc
      
      * update
      Co-authored-by: zmxdream <zhangminxu01@baidu.com>
      a9d8b947
    • F
      add trt support for slice op (#41467) (#41911) · 7ec1e9af
      feng_shuai committed
      7ec1e9af
    • F
      add div plugin and add filter (#41243) (#41908) · 15d30815
      feng_shuai committed
      15d30815
    • Z
      [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode (#41668) (#41895) · 68643a9e
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      
      * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
      
      * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode
      
      * Enabled more test cases
      
      * Fixed performance issues
      
      * Fixed minor issue
      68643a9e
    • J
      fix_pool2d_trt_convert (#41860) (#41915) · e568268b
      JingZhuangzhuang committed
      e568268b
    • J
      fix infer gpu strategy (#41925) · aa6eb0e8
      JingZhuangzhuang committed
      aa6eb0e8
    • T
      cinn_launch_op: optimize the overhead of preparing variables before executing... · dab7dfbf
      TeFeng Chen committed
      cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) (#41910)
      
      cherry-pick #41777
      * optimize preparation overhead before executing cinn compiled program
      dab7dfbf
    • Z
      [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode (#41612) (#41894) · 0fb06e46
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      
      * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
      
      * Fixed minor issues
      0fb06e46
  4. 18 Apr, 2022 7 commits
    • L
      6449a232
    • L
      update (#41756) · 97d1ab2a
      lilong12 committed
      97d1ab2a
    • R
      fix bugs in moe (#41903) · f92dbfb7
      Roc committed
      * fix moe apis (#41650)
      
      * Moe ref (#41836)
      
      * moe ref
      
      * ref commit
      
      * update; document_fix
      
      * update;document_fix
      
      * Moe ref (#41864)
      
      * moe ref
      
      * ref commit; document_fix
      
      * update; document_fix
      
      * update document_fix
      
      * update; document_fix
      f92dbfb7
    • Z
      [cherry-pick]XPUPS add support for kunlun2 (#41916) · 3a2fb4cf
      zmxdream committed
      * [XPUPS]add support for kunlun2 (#40985)
      
      
      [XPUPS]add support for kunlun2
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]fix hashtable_kernel.kps (#41790)
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix hashtable_kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]modify xpu_kp.cmake with HETERPS&PSLIB (#41760)
      
      * modify xpu_kp.cmake with HETERPS&PSLIB
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      3a2fb4cf
    • C
      [Phi]Reduce kernels into multiple files (#41747) (#41854) · 688f4ec0
      chentianyu03 committed
      * split reduce_kernel
      
      * rm reduce_kernel in cmake
      
      * split reduce_grad kernels
      
      * fix cmake build error
      
      * format code
      
      * fix standalone_executor_test error
      688f4ec0
    • Z
      [DoubleGrad] Enabled double grad test cases in eager_mode for... · a367fbab
      Zhanlue Yang committed
      [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893)
      
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      a367fbab
    • H
      [Cherry-Pick] take along axis bug fix (#41863) · e7980adf
      huangxu96 committed
      This PR is the cherry-pick of #41824
      
      This PR fixes a bug that caused a CUDA address error. The cause was an incorrectly set grid size for the CUDA kernel.
      e7980adf
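The message does not say how the grid size was wrong. As a sketch of the general failure mode only (the names `grid_size`, `covered_indices`, `n`, and `block` are illustrative and not taken from the PR): a 1-D launch needs enough blocks to cover every element, which calls for ceiling division, and each thread must still check that its index is below `n`.

```python
def grid_size(n, block):
    """Blocks needed so that grid * block >= n (ceiling division)."""
    if n < 0 or block <= 0:
        raise ValueError("n must be >= 0 and block must be > 0")
    return (n + block - 1) // block


def covered_indices(n, block):
    """Simulate a guarded 1-D launch and list the element indices that threads touch."""
    touched = []
    for block_id in range(grid_size(n, block)):
        for thread_id in range(block):
            idx = block_id * block + thread_id
            if idx < n:  # bounds guard; without it, surplus threads index past the end
                touched.append(idx)
    return touched
```

With `n = 1000` and `block = 256`, floor division would give 3 blocks and leave the last 232 elements unprocessed, whereas ceiling division gives 4. Launching too many blocks without the bounds guard is the opposite mistake, and it is the one that typically shows up as an illegal address.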