1. 18 Mar, 2022 4 commits
  2. 17 Mar, 2022 27 commits
    • C
      [Phi] Move assign kernel into phi (#40022) · 1904572a
      Chen Weihang committed
      * move assign kernel init commit
      
      * change vec<tensor> to vec<tensor*>
      
      * support tensor array
      
      * support api declare
      
      * fix test_list failed
      
      * fix npu and xpu failed
      
      * fix infrt failed
      
      * remove assign array size in operator
      
      * move assign sr header into sr dir
      
      * add infermeta for assign
      
      * test op success
      
      * fix test_list failed
      
      * fix kunlun failed
      
      * add set host allocator in tests
      
      * support tensor array in arg ctx
      
      * open set layout in share_meta
      
      * fix meta tensor layout error
      
      * fix test failed
      1904572a
    • S
      merge cpu and gpu graph engines (#40597) · 31776199
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      31776199
    • C
      Revert "Fix truncated norm operator (#40287)" (#40614) · 313bff6b
      Chang Xu committed
      This reverts commit 0c333543.
      313bff6b
    • N
    • T
      7dad9f70
    • H
      CopyFromCpu and CopyToCpu of Onnxruntime back-end optimize (#40561) · fcbb7440
      heliqi committed
      * add onnxruntime predictor
      
      * Add code comments
      
      * support link paddle2onnx onnxruntime
      
      * support onnxruntime with python
      
      * support onnxruntime with python
      
      * support onnxruntime with windows
      
      * paddle2onnx compile with windows
      
      * support windows compile
      
      * support windows compile with onnxruntime
      
      * support windows compile with paddle2onnx
      
      * support mac compile
      
      * compile with mac
      
      * compile with mac
      
      * add code comments
      
      * fix remind word
      
      * code optimization
      
      * add test case
      
      * add test case
      
      * add inference demo_ci test case
      
      * fix compile paddle2onnx with no python
      
      * add inference demo_ci test case
      
      * add inference demo_ci test case
      
      * add inference infer_ut test case
      
      * support c go api and test cases
      
      * add coverage test case
      
      * add coverage test case
      
      * add capi test case
      
      * add capi test case
      
      * fix onnxruntime copyfromcpu and copytocpu
      
      * fix goapi
      
      * modify code
      fcbb7440
    • Q
      [ROCm] fix bfloat16 support, test=develop (#40401) · da558f0e
      Qi Li committed
      da558f0e
    • W
      [Bug fixes] Fix partial grad conflicts (#40655) · 60899549
      Weilong Wu committed
      * [Eager] Support eager grad interface, draft version
      
      * Support eager grad interface with allow_unused and multi startup_op
      
      * Fix code format
      
      * Fix allow_unused case, return PyNone if tensor not initialize
      
      * Support output's stop_gradient related to create_graph
      
      * Support grad exception case in eager mode, fix coverage CI
      
      * Update ToPyObject, return PyNone if not initialize
      
      * AccumulationNode add FLAGS_retain_grad_for_all_tensor
      
      * Fix ci issue
      
      * Fix CI issue
      
      * fix, use core.eager.Tensor
      
      * Add func SetBufferSlotRankZeros for GradTensorHolder
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Update retain_graph and no_grad_vars related test case
      
      * Update code gen logic for ClearTensorWrappers
      
      * Fix by override statement
      
      * fix override func args
      
      * Support retain_graph, update unit tests
      
      * Updated ClearTensorWrappers logic
      
      * fix grad python interface
      
      * Use deep copy and update unit tests
      
      * Polish code
      
      * Polish code
      
      * Fix CI issue, Deep copy only use when user set grad_tensors
      
      * Fix CI, use Backward instead RunBackward
      
      * Fix CI, Declare kernel explicitly in test file
      
      * Polish, remove vector of TensorWrapper
      
      * Refactor the logic of grad/backward, polish codes
      
      * Update code after merge upstream develop
      
      * Polish after merge upstream develop
      
      * Update to adapt new GradNodeBase superclass
      
      * Fix error introduced during conflict resolution
      
      * Update purify potential_startup_nodes logic
      
      * Fix errors
      
      * Polish code
      
      * Remove useless args for ToPyObject
      
      * Remove useless TensorWrappersSet
      
      * Fix code-format, re-install pre-commit
      
      * Fix pre-process logic for potential_startup_ops
      
      * Update unit tests, use eager mode
      
      * Fix conflicts
      60899549
    • Y
      rename math (#40641) · 883a8eea
      YuanRisheng committed
      883a8eea
    • Z
      [PHI] move roi_pool kernel to phi (#40574) · 7d0db629
      zyfncg committed
      * move roi_pool forward kernel to phi
      
      * move roi_pool_grad to phi
      
      * fix compile bug
      
      * fix compile bug
      
      * fix register data_type
      7d0db629
    • H
      Move layer norm to phi (#40193) · 681a6865
      hong committed
      * update
      
      * fix bugs; test=develop
      
      * update; test=develop
      
      * fix test compile error; test=develop
      
      * fix cpu compile error; test=develop
      
      * fix test error; test=develop
      
      * fix layer_norm_op plugin error; test=develop
      
      * fix error; test=develop
      
      * fix test bug; test=develop
      
      * update; test=develop
      
      * polish code; test=develop
      
      * fix bugs; test=develop
      
      * remove unused dependency; test=develop
      
      * polish code; test=develop
      681a6865
    • Z
      move infershape of set_value to phi (#40636) · c335288d
      zyfncg committed
      c335288d
    • Y
      move activation sigmoid (#40626) · ed8a9370
      YuanRisheng committed
      ed8a9370
    • Z
      [Phi]Move infershape of top_k/expand_as/kron/searchsorted to phi (#40632) · 9ee03302
      Zhang Zheng committed
      * [Phi]Move infershape of top_k/expand_as/kron/searchsorted to phi
      
      * add set_dtype
      
      * fix order
      9ee03302
    • N
      Replace PADDLE_WITH_XPU2 with PADDLE_WITH_KP (#40560) · c142e37d
      niuliling123 committed
      * Replace PADDLE_WITH_XPU2 with PADDLE_WITH_KP
      c142e37d
    • Y
      [fleet executor] fleet executor for npu (#40607) · 81848fff
      Yuang Liu committed
      81848fff
    • B
      support gpu mixed precision inference (#40531) · 06fee998
      baoachun committed
      06fee998
    • X
      fix xpu compile error: introduced by scalar.cc (#40630) · ade72108
      xiongkun committed
      ade72108
    • W
      [Eager Grad] Support eager grad interface (#40170) · 4db8cf24
      Weilong Wu committed
      * [Eager] Support eager grad interface, draft version
      
      * Support eager grad interface with allow_unused and multi startup_op
      
      * Fix code format
      
      * Fix allow_unused case, return PyNone if tensor not initialize
      
      * Support output's stop_gradient related to create_graph
      
      * Support grad exception case in eager mode, fix coverage CI
      
      * Update ToPyObject, return PyNone if not initialize
      
      * AccumulationNode add FLAGS_retain_grad_for_all_tensor
      
      * Fix ci issue
      
      * Fix CI issue
      
      * fix, use core.eager.Tensor
      
      * Add func SetBufferSlotRankZeros for GradTensorHolder
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Update retain_graph and no_grad_vars related test case
      
      * Update code gen logic for ClearTensorWrappers
      
      * Fix by override statement
      
      * fix override func args
      
      * Support retain_graph, update unit tests
      
      * Updated ClearTensorWrappers logic
      
      * fix grad python interface
      
      * Use deep copy and update unit tests
      
      * Polish code
      
      * Polish code
      
      * Fix CI issue, Deep copy only use when user set grad_tensors
      
      * Fix CI, use Backward instead RunBackward
      
      * Fix CI, Declare kernel explicitly in test file
      
      * Polish, remove vector of TensorWrapper
      
      * Refactor the logic of grad/backward, polish codes
      
      * Update code after merge upstream develop
      
      * Polish after merge upstream develop
      
      * Update to adapt new GradNodeBase superclass
      
      * Fix error introduced during conflict resolution
      
      * Update purify potential_startup_nodes logic
      
      * Fix errors
      
      * Polish code
      
      * Remove useless args for ToPyObject
      
      * Remove useless TensorWrappersSet
      
      * Fix code-format, re-install pre-commit
      
      * Fix pre-process logic for potential_startup_ops
      
      * Update unit tests, use eager mode
      4db8cf24
    • 4c01763c
    • Z
      Optimize the performance of C++ API (#40640) · add304ed
      zyfncg committed
      * Optimize performance
      
      * optimize c++ api performance
      
      * remove unused code
      
      * fix paddle throw
      
      * update format
      add304ed
    • J
      fix copy_ problem by doing it with phi copy (#40521) · c1931beb
      Jiabin Yang committed
      * fix copy_ problem by doing it with phi copy
      
      * improve test coverage
      
      * refactor copy with sr kernel
      c1931beb
    • C
      move grid sample op infershape (#40625) · b1b24463
      Chen Weihang committed
      b1b24463
    • L
      Improve the performance of fake quantize OP (#40491) · 827b6a0e
      Leo Chen committed
      * Move the computation of moving average scale to device
      
      * Use register to save local maximum in a thread
      827b6a0e
    • W
      Trt engine. (#40532) · 3082ed46
      Wilber committed
      * infrt add trt engine
      
      * fix register
      
      * file generate
      
      * fix ci error
      
      * fix conflict
      
      * add copyright
      
      * update
      
      * update
      
      * update
      
      * update engine name
      
      * refactor trt code
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix conflict
      
      * update
      
      * fix compile with cuda
      3082ed46
    • [infrt] move pd dialect position. test=develop (#40616) · 3a256637
      王明冬 committed
      3a256637
  3. 16 Mar, 2022 9 commits