1. 21 4月, 2022 4 次提交
  2. 20 4月, 2022 2 次提交
    • S
      cherry pick recent updates in graph-engine to release2.3 (#42027) · ef78c9c2
      seemingwang 提交于
      * gpu_graph engine optimization+ (#41455)
      
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * Cpu gpu graph engine (#41942)
      
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * fix linking problem
      
      * remove comment
      ef78c9c2
    • L
      [cherry-pick] Cherry pick pr of new-exec (#42009) · 80992253
      Leo Chen 提交于
      * [new-exec] shrink downstream map (#41471)
      
      * shrink downstream map
      
      * shrink last live ops of var
      
      * add comment
      
      * fix bug
      
      * add dependency for send/recv to support pp parallel (#41652)
      
      * [new-exec] clear the scope listener after run (#41947)
      
      * clear the listener after run
      
      * only sync variables in program
      
      * refine code
      
      * fit for lod_tensor_blocking_queue
      80992253
  3. 19 4月, 2022 4 次提交
    • Z
      [XPUPS]add rename for heter_ps.cu (#41922) (#41968) · 13202ff7
      zmxdream 提交于
      * add rename for heter_ps.cu
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      13202ff7
    • Y
      [Cherry-pick 2.3] Autotune the workspace and kernel choosing of conv (#41833) · b4adbe5c
      Yiqun Liu 提交于
      Cherry-pick #40338 #41741 #41313
      b4adbe5c
    • F
      [cherry-pick] XPUPS Adaptation (#41917) · a9d8b947
      Fan Zhang 提交于
      * XPUPS Adaptation (#40991)
      
      * Adapt XPUPS - 1st version - 3.24
      
      * Adapt XPUPS - update XPU PushSparse -  2nd version - 3.24
      
      * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * Adapt XPUPS - modify by compilation - 4th version - 3.27
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * heter_comm update
      
      * heter_comm update
      
      * update calc_shard_offset. test=develop
      
      * heter_comm update
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30
      
      * update. test=develop
      
      * update pslib.cmake
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 6th version - 3.30
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * used by minxu
      
      * update heter_comm_inl
      
      * fix. test=develop
      
      * Adapt XPUPS - modify by kp compilation  - 7th version - 3.30
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 3.31 update
      
      * Adapt XPUPS - update kp compilation path  - 8th version - 3.31
      
      * add optimizer kernel. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm_kernel.kps 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update heter_comm.h 3.31
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 9th version - 4.1
      
      * update hashtable. test=develop
      
      * fix. test=develop
      
      * update hashtable 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 10th version - 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.1 19:30
      
      * fix. test=develop
      
      * update ps_gpu_wrapper.kps 4.1
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 11th version - 4.1
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 12nd version - 4.2
      
      * fix. test=develop
      
      * fix. test=develop
      
      * modify by compilation 4.2
      
      * 4.2 update
      
      * fix. test=develop
      
      * template init. test=develop
      
      * update 4.6
      
      * fix. test=develop
      
      * template init. test=develop
      
      * 4.6 modify by compilation
      
      * hashtable template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 13nd version - 4.7
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.11 update
      
      * update by pre-commit
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * 4.12 update
      
      * fix. test=develop
      
      * Adapt XPUPS - update by kp compilation  - 14th version - 4.13
      
      * 4.13 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 update
      
      * 4.14 modify by merged latest compilation
      
      * retry CI 4.14
      
      * 4.15 pass static check
      
      * 4.15 modify by gpups CI
      
      * 3.16 update by gpups CI - modify ps_gpu_wrapper.h
      
      * 4.16 update
      
      * 4.16 pass xpu compile
      
      * 4.16 retry CI
      
      * 4.16 update
      Co-authored-by: Nzmxdream <zhangminxu01@baidu.com>
      
      * modify ps_gpu_wrapper.cc
      
      * update
      Co-authored-by: Nzmxdream <zhangminxu01@baidu.com>
      a9d8b947
    • T
      cinn_launch_op: optimize the overhead of preparing variables before executing... · dab7dfbf
      TeFeng Chen 提交于
      cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) (#41910)
      
      cherry-pick #41777
      * optimize preparation overhead before executing cinn compiled program
      dab7dfbf
  4. 18 4月, 2022 3 次提交
    • Z
      [cherry-pick]XPUPS add support for kunlun2 (#41916) · 3a2fb4cf
      zmxdream 提交于
      * [XPUPS]add support for kunlun2 (#40985)
      
      
      [XPUPS]add support for kunlun2
      Co-authored-by: NWorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]fix hashtable_kernel.kps (#41790)
      
      * refactor heter comm kernel
      
      * update. test=develop
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix hashtable_kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: NWorgenZhang <frank08081993@gmail.com>
      
      * [XPUPS]modify xpu_kp.cmake with HETERPS&PSLIB (#41760)
      
      * modify xpu_kp.cmake with HETERPS&PSLIB
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: NWorgenZhang <frank08081993@gmail.com>
      3a2fb4cf
    • C
      [Phi]Reduce kernels into multiply files (#41747) (#41854) · 688f4ec0
      chentianyu03 提交于
      * split reduce_kernel
      
      * rm reduce_kernel in cmake
      
      * split reduce_grad kernels
      
      * fix cmake build error
      
      * format code
      
      * fix standalone_executor_test error
      688f4ec0
    • C
      [Cherry-pick] Organize the API of custom operators (#41882) · 897911fc
      Chen Weihang 提交于
      * [Phi&CustomOp] Remove deprecated enum PlaceType for custom op & add warning (#41647)
      
      * remove old custom op placetype
      
      * replace dist  placetype using
      
      * add with gpu macro
      
      * fix mutable_data error
      
      * fix set value error
      
      * add comment
      
      * remove all is initialized using (#41766)
      
      * remove inner_place using (#41768)
      
      * polish tensor depreacted method warning (#41807)
      
      * [CustomOp] Fix PlaceType related compat error (#41826)
      
      * fix place type related compat error
      
      * fix test failed
      
      * remove dll decl
      
      * revert place type change
      
      * add dll decl
      
      * resolve conflict
      897911fc
  5. 15 4月, 2022 2 次提交
  6. 14 4月, 2022 1 次提交
  7. 13 4月, 2022 1 次提交
  8. 11 4月, 2022 3 次提交
  9. 06 4月, 2022 1 次提交
  10. 05 4月, 2022 2 次提交
  11. 04 4月, 2022 2 次提交
  12. 03 4月, 2022 1 次提交
    • C
      [Phi]Concat grad (#41112) · 3f57ef7a
      chentianyu03 提交于
      * add concat_grad kernel
      
      * fix error
      
      * remove comment code
      
      * fix outs nullptr error
      
      * change to phi header
      
      * add concat_grad declare for standalone_executor_test
      3f57ef7a
  13. 02 4月, 2022 5 次提交
  14. 01 4月, 2022 6 次提交
    • L
      fix mac c++ version (#41172) · a2c01db1
      liutiexing 提交于
      * fix mac c++ version
      
      * update
      
      * fix apple systems
      a2c01db1
    • C
      [Phi] Move softmax with cross entropy kernel into phi (#40832) · e6ec98fe
      Chen Weihang 提交于
      * add cross_entropy_with_softmax phi kernel
      
      * remove softmax_with_cross_entropy kernel
      
      * add softmax_with_cross_entropy grad kernel
      
      * remove original op kernel
      
      * refine cross entropy impl
      
      * fix pointer error
      
      * revert kernel cu change
      
      * fix xpu failed
      
      * fix cinn failed
      
      * fix npu failed
      
      * add forward sig
      
      * add check_nan_inf for pt kernel
      
      * remove repeat cmake item
      
      * fix unittest error
      e6ec98fe
    • C
      [Phi]Interploatd kernels into phi (#40855) · d65a7a46
      chentianyu03 提交于
      * add interploate cpu kernel
      
      * fix nullptr bug
      
      * add interpolate gpu kernel
      
      * fix unit test error
      
      * remove raw kernels
      
      * add cuda kernel impl
      
      * add infermeta
      
      * recover accidentally deleted kernels in interpolate op
      
      * fix grad x_grad name error
      
      * remove interpolate_v2_op.h
      
      * rm unused codes
      
      * fix xpu build error
      
      * fix build error
      
      * fix namespace error
      
      * add register header for nup
      
      * fix infermeta error
      
      * modify by review
      
      * add the missing args in test_trt_convert_nearest_interp_v2
      d65a7a46
    • Z
      [GPUPS]fix CMakeLists with pslib (#41225) · 4da4265a
      zmxdream 提交于
      * fix cmake. test=develop
      
      * fix. test=develop
      
      * fix dep for graphs_ps_gpu. test=develop
      
      * update. test=develop
      
      * update. test=develop
      4da4265a
    • A
      [custom kernel] support fallback (#41212) · 9c2a9afd
      Aganlengzi 提交于
      9c2a9afd
    • L
      [new-exec] move WaitEvent/RecordEvent into try-catch (#41222) · 5dae6da0
      Leo Chen 提交于
      * move WaitEvent/RecordEvent into try-catch
      
      * refine supportNpu
      5dae6da0
  15. 31 3月, 2022 3 次提交