1. 17 Mar 2022, 13 commits
    • support gpu mixed precision inference (#40531) · 06fee998
      baoachun committed
    • update; test=develop · 6c7a03bd
      phlrain committed
    • fix xpu compile error: introduced by scalar.cc (#40630) · ade72108
      xiongkun committed
    • [Eager Grad] Support eager grad interface (#40170) · 4db8cf24
      Weilong Wu committed
      * [Eager] Support eager grad interface, draft version
      
      * Support eager grad interface with allow_unused and multi startup_op
      
      * Fix code format
      
      * Fix allow_unused case, return PyNone if tensor not initialize
      
      * Support output's stop_gradient related to create_graph
      
      * Support grad exception case in eager mode, fix coverage CI
      
      * Update ToPyObject, return PyNone if not initialize
      
      * AccumulationNode add FLAGS_retain_grad_for_all_tensor
      
      * Fix ci issue
      
      * Fix CI issue
      
      * fix, use core.eager.Tensor
      
      * Add func SetBufferSlotRankZeros for GradTensorHolder
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Update retain_graph and no_grad_vars related test case
      
      * Update code gen logic for ClearTensorWrappers
      
      * Fix by override statement
      
      * fix override func args
      
      * Support retain_graph, update unit tests
      
      * Updated ClearTensorWrappers logic
      
      * fix grad python interface
      
      * Use deep copy and update unit tests
      
      * Polish code
      
      * Polish code
      
      * Fix CI issue, Deep copy only use when user set grad_tensors
      
      * Fix CI, use Backward instead RunBackward
      
      * Fix CI, Declare kernel explicitly in test file
      
      * Polish, remove vector of TensorWrapper
      
      * Refactor the logic of grad/backward, polish codes
      
      * Update code after merge upstream develop
      
      * Polish after merge upstream develop
      
      * Update to adapt new GradNodeBase superclass
      
      * Fix error introduced during conflict resolution
      
      * Update purify potential_startup_nodes logic
      
      * Fix errors
      
      * Polish code
      
      * Remove useless args for ToPyObject
      
      * Remove useless TensorWrappersSet
      
      * Fix code-format, re-install pre-commit
      
      * Fix pre-process logic for potential_startup_ops
      
      * Update unit tests, use eager mode
    • 4c01763c
    • Optimize the performance of C++ API (#40640) · add304ed
      zyfncg committed
      * Optimize performance
      
      * optimiaze c++ api performance
      
      * remove unsed code
      
      * fix paddle throw
      
      * updata format
    • fix copy_ problem by doing it with phi copy (#40521) · c1931beb
      Jiabin Yang committed
      * fix copy_ problem by doing it with phi copy
      
      * improve test coverage
      
      * refactor copy with sr kernel
    • move grid sample op infershape (#40625) · b1b24463
      Chen Weihang committed
    • Improve the performance of fake quantize OP (#40491) · 827b6a0e
      Leo Chen committed
      * Move the computation of moving average scale to device
      
      * Use register to save local maximum in a thread
    • Trt engine. (#40532) · 3082ed46
      Wilber committed
      * infrt add trt engine
      
      * fix register
      
      * file generate
      
      * fix ci error
      
      * fix conflict
      
      * add copyright
      
      * update
      
      * update
      
      * update
      
      * update engine name
      
      * refactor trt code
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix conflict
      
      * update
      
      * fix compile with cuda
    • [infrt] move pd dialect position. test=develop (#40616) · 3a256637
      王明冬 committed
    • update · def33631
      phlrain committed
  2. 16 Mar 2022, 27 commits