1. 19 Apr, 2022 12 commits
    • [Cherry-pick 2.3] Autotune the workspace and kernel choosing of conv (#41833) · b4adbe5c
      Yiqun Liu committed
      Cherry-pick #40338 #41741 #41313
      b4adbe5c
    • [cherry-pick] XPUPS Adaptation (#41917) · a9d8b947
      Fan Zhang committed
      * XPUPS Adaptation (#40991)
      
      * Adapt XPUPS - 1st version - 3.24
      
      * Adapt XPUPS - update XPU PushSparse -  2nd version - 3.24
      
      * Adapt XPUPS - add XPU PullSparseOp - 3rd version - 3.25
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * Adapt XPUPS - modify by compilation - 4th version - 3.27
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * heter_comm update
      
      * heter_comm update
      
      * update calc_shard_offset. test=develop
      
      * heter_comm update
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30
      
      * update. test=develop
      
      * update pslib.cmake
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 6th version - 3.30
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * used by minxu
      
      * update heter_comm_inl
      
      * fix. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 7th version - 3.30
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 3.31 update
      
      * Adapt XPUPS - update kp compilation path  - 8th version - 3.31
      
      * add optimizer kernel. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm.h 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 9th version - 4.1
      
      * update hashtable. test=develop
      
      * fix. test=develop
      
      * update hashtable 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 10th version - 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1 19:30
      
      * fix. test=develop
      
      * update ps_gpu_wrapper.kps 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 11th version - 4.1
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 12th version - 4.2
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.2
      
      * 4.2 update
      
      * fix. test=develop
      
      * template init. test=develop
      
      * update 4.6
      
      * fix. test=develop
      
      * template init. test=develop
      
      * 4.6 modify by compilation
      
      * hashtable template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 13th version - 4.7
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * update by pre-commit
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.12 update
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 14th version - 4.13
      
      * 4.13 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 modify by merged latest compilation
      
      * retry CI 4.14
      
      * 4.15 pass static check
      
      * 4.15 modify by gpups CI
      
      * 3.16 update by gpups CI - modify ps_gpu_wrapper.h
      
      * 4.16 update
      
      * 4.16 pass xpu compile
      
      * 4.16 retry CI
      
      * 4.16 update
      Co-authored-by: zmxdream <zhangminxu01@baidu.com>
      
      * modify ps_gpu_wrapper.cc
      
      * update
      Co-authored-by: zmxdream <zhangminxu01@baidu.com>
      a9d8b947
    • add trt support for slice op (#41467) (#41911) · 7ec1e9af
      feng_shuai committed
      7ec1e9af
    • add div plugin and add filter (#41243) (#41908) · 15d30815
      feng_shuai committed
      15d30815
    • Revert "modify xpu.cmake,*test=kunlun (#41832)" · f293bcb8
      z8hanghuan committed
      This reverts commit 8ccdb91b.
      f293bcb8
    • [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode (#41668) (#41895) · 68643a9e
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      
      * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
      
      * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode
      
      * Enabled more test cases
      
      * Fixed performance issues
      
      * Fixed minor issue
      68643a9e
    • fix_poo2d_trt_convert (#41860) (#41915) · e568268b
      JingZhuangzhuang committed
      e568268b
    • fix infer gpu strategy (#41925) · aa6eb0e8
      JingZhuangzhuang committed
      aa6eb0e8
    • Add kernel sparse_mask_helper; sparse_coo_tensor_grad (#41586) (#41902) · 44d8c6ed
      zhangkaihuo committed
      Cherry-pick PR #41586 to release/2.3
      44d8c6ed
    • cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) (#41910) · dab7dfbf
      TeFeng Chen committed
      cherry-pick #41777
      * optimize preparation overhead before executing cinn compiled program
      dab7dfbf
    • [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode (#41612) (#41894) · 0fb06e46
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      
      * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
      
      * Fixed minor issues
      0fb06e46
    • Optimization for graph_sample_neighbors API (#41447) (#41897) · 6115b016
      Siming Dai committed
      * add eids result for graph_sample_neighbors
      
      * fix bug
      
      * move fisher_yates sample to warp
      
      * add cpu eid output
      
      * delete comment
      
      * delete comment
      
      * change nullptr placeholder
      
      * optimize sample kernel
      
      * fix mutable_data
      6115b016
  2. 18 Apr, 2022 11 commits
    • L
      6449a232
    • update (#41756) · 97d1ab2a
      lilong12 committed
      97d1ab2a
    • [Eager] Add _fallback_legacy_dygraph for npu/xpu/rocm (#41774) (#41898) · 96c95b3d
      Aurelius84 committed
      * [Eager] add _fallback_legacy_dygraph for npu/xpu/rocm
      
      * fix import
      96c95b3d
    • fix bugs in moe (#41903) · f92dbfb7
      Roc committed
      * fix moe apis (#41650)
      
      * Moe ref (#41836)
      
      * moe ref
      
      * ref commit
      
      * update; document_fix
      
      * update;document_fix
      
      * Moe ref (#41864)
      
      * moe ref
      
      * ref commit; document_fix
      
      * update; document_fix
      
      * update document_fix
      
      * update; document_fix
      f92dbfb7
    • [cherry-pick]XPUPS add support for kunlun2 (#41916) · 3a2fb4cf
      zmxdream committed
      * [XPUPS]add support for kunlun2 (#40985)
      
      
      [XPUPS]add support for kunlun2
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]fix hashtable_kernel.kps (#41790)
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix hashtable_kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]modify xpu_kp.cmake with HETERPS&PSLIB (#41760)
      
      * modify xpu_kp.cmake with HETERPS&PSLIB
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
      3a2fb4cf
    • modify xpu.cmake,*test=kunlun (#41832) · 8ccdb91b
      z8hanghuan committed
      * modify xpu.cmake,*test=kunlun
      
      * modify xpu.cmake,*test=kunlun
      
      * modify xpu.cmake,*test=kunlun
      
      * modify xpu.cmake,*test=kunlun
      8ccdb91b
    • [Phi]Reduce kernels into multiple files (#41747) (#41854) · 688f4ec0
      chentianyu03 committed
      * split reduce_kernel
      
      * rm reduce_kernel in cmake
      
      * split reduce_grad kernels
      
      * fix cmake build error
      
      * format code
      
      * fix standalone_executor_test error
      688f4ec0
    • [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893) · a367fbab
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      a367fbab
    • [Cherry-Pick] take along axis bug fix (#41863) · e7980adf
      huangxu96 committed
      This PR is the cherry-pick of #41824
      
      This PR fixes a bug that caused a CUDA address error. The cause was that the grid size of the CUDA kernel had been set incorrectly.
      e7980adf
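The commit message above does not say how the grid number was mis-set. Purely as an illustration, and not as Paddle's actual code, the sketch below models the usual sizing rule in plain Python: the grid is the ceiling of elements over block size, and each thread checks its index before touching memory. The names `grid_size` and `simulate_launch` are hypothetical.

```python
def grid_size(num_elements: int, block_size: int) -> int:
    """Blocks needed so that grid * block_size covers every element."""
    if num_elements < 0 or block_size <= 0:
        raise ValueError("num_elements must be >= 0 and block_size > 0")
    # Ceiling division: plain floor division would drop the final partial block.
    return (num_elements + block_size - 1) // block_size


def simulate_launch(num_elements: int, block_size: int, grid: int) -> list:
    """Return the element indices touched by a simulated 1-D kernel launch."""
    touched = []
    for block in range(grid):
        for thread in range(block_size):
            idx = block * block_size + thread
            # Bounds guard: without it, threads in the last block would
            # index past the end of the buffer.
            if idx < num_elements:
                touched.append(idx)
    return touched


if __name__ == "__main__":
    n, block = 1000, 256
    print(grid_size(n, block))                                # blocks with ceiling division
    print(len(simulate_launch(n, block, grid_size(n, block))))  # elements covered
    print(len(simulate_launch(n, block, n // block)))           # elements covered with floor division
```

With floor division the last partial block is never launched, so some elements are skipped; with an oversized grid and no guard, threads would address memory outside the buffer.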
    • Add eager string tensor (#41039) (#41839) · 623f8308
      Jack Zhou committed
      * Add core.eager.StringTensor __init__ which pyarray args can be passed
      
      * Add the numpy method of core.eager.StringTensor
      
      * revert tensor.to_string modification
      
      * Add ToPyObject for core.eager.StringTensor
      
      * Add debug string for core.eager.StringTensor
      
      * Remove place args of core.eager.StringTensor temporarily
      
      * Fix check string_tensor error
      
      * remove dtype of core.eager.StringTensor
      
      * add core.eager.StringTensor unittest
      
      * remove pstring from VarDesc
      
      * Add InitStringTensorWithStringTensor
      
      * Remove to_string modification
      
      * Remove zero_copy arg from StringTensor creator
      623f8308
    • [Cherry-pick] Organize the API of custom operators (#41882) · 897911fc
      Chen Weihang committed
      * [Phi&CustomOp] Remove deprecated enum PlaceType for custom op & add warning (#41647)
      
      * remove old custom op placetype
      
      * replace dist  placetype using
      
      * add with gpu macro
      
      * fix mutable_data error
      
      * fix set value error
      
      * add comment
      
      * remove all is initialized using (#41766)
      
      * remove inner_place using (#41768)
      
      * polish tensor deprecated method warning (#41807)
      
      * [CustomOp] Fix PlaceType related compat error (#41826)
      
      * fix place type related compat error
      
      * fix test failed
      
      * remove dll decl
      
      * revert place type change
      
      * add dll decl
      
      * resolve conflict
      897911fc
  3. 15 Apr, 2022 10 commits
  4. 14 Apr, 2022 7 commits
    • [CustomOp]Add new method for custom double grad (#41538) (#41781) · 76d5483a
      Chen Weihang committed
      * add new method for custom double grad
      
      * add tanh double grad unittest
      
      * change year
      
      * revert tensor init method
      76d5483a
    • [CustomOp] Add context pool unittests (#41085) (#41782) · 5450e42c
      Chen Weihang committed
      * add context pool unittests
      
      * fix timeout
      
      * polish details
      
      * change option pos
      
      * add dll decl for windows
      
      * fix pre-commit error
      
      * move dll_decl and export DeviceContext
      
      * replace lost dll_decl.h
      5450e42c
    • Adjusted CUDA Arches (#41754) · 1c15af3e
      Zhanlue Yang committed
      1c15af3e
    • Cherry pick final state ops (#41755) · 921a6fb7
      chentianyu03 committed
      * [Yaml]add exp yaml (#41217)
      
      * add exp yaml
      
      * add exp api in test case
      
      * add determinant yaml
      
      * fix exp op unittest
      
      * change test class name
      
      * modify api name
      
      * compacted with raw api
      
      * fix det api
      
      * add python_api
      
      * add test eager for determinant op
      
      * [Yaml] Add assign yaml (#41428)
      
      * add assign yaml
      
      * add assign api
      
      * add assign backward api
      
      * add assign
      
      * add assign yaml
      
      * add assign
      
      * assign yaml
      
      * add assign raw kernel and use assign_raw in yaml
      
      * merge develop branch
      
      * add missing python_api
      
      * exchange assign and assign_raw kernel name (#41625)
      
      * exchange assign and assign_raw kernel name
      
      * fix register error
      
      * [Yaml]add gaussian_random yaml and test case (#41312)
      
      * add gaussian random yaml
      
      * add gaussian_random yaml and test case
      
      * fix error modify of full yaml
      
      * import in_dygraph_mode
      
      * import _in_legacy_dygraph
      
      * add place arg in api
      
      * import __current_expected_place
      
      * fix test_egr_python_api failed case
      
      * add test case
      
      * add cast for NormalInitializer
      
      * fix test error
      
      * fix test error
      
      * rm unused check code
      
      * fix test error in test_initializer_nn
      
      * modify by review
      
      * [Phi]fix split error when sections has 0 size and add test case (#41708)
      
      * fix split error when sections has 0 size and add test case
      
      * fix test case
      921a6fb7
    • X
    • add fp16 kernel to clip_grad (#41675) · d447c678
      wuyefeilin committed
      d447c678
    • fix new dygraph record event (#41715) (#41771) · 4be96ad9
      chenjian committed
      * fix new dygraph record event
      
      * refine name
      
      * fix
      
      * fix
      
      * fix according to review
      4be96ad9