1. 17 Mar, 2022 10 commits
    • H
      Move layer norm to phi (#40193) · 681a6865
      hong committed
      * update
      
      * fix bugs; test=develop
      
      * update; test=develop
      
      * fix test compile error; test=develop
      
      * fix cpu compile error; test=develop
      
      * fix test error; test=develop
      
      * fix layer_norm_op plugin error; test=develop
      
      * fix error; test=develop
      
      * fix test bug; test=develop
      
      * update; test=develop
      
      * polish code; test=develop
      
      * fix bugs; test=develop
      
      * remove unused dependency; test=develop
      
      * polish code; test=develop
    • Z
      move infershape of set_value to phi (#40636) · c335288d
      zyfncg committed
    • Y
      move activation sigmoid (#40626) · ed8a9370
      YuanRisheng committed
    • Z
      [Phi]Move infershape of top_k/expand_as/kron/searchsorted to phi (#40632) · 9ee03302
      Zhang Zheng committed
      * [Phi]Move infershape of top_k/expand_as/kron/searchsorted to phi
      
      * add set_dtype
      
      * fix order
    • Y
      [fleet executor] fleet executor for npu (#40607) · 81848fff
      Yuang Liu committed
    • B
      support gpu mixed precision inference (#40531) · 06fee998
      baoachun committed
    • W
      [Eager Grad] Support eager grad interface (#40170) · 4db8cf24
      Weilong Wu committed
      * [Eager] Support eager grad interface, draft version
      
      * Support eager grad interface with allow_unused and multi startup_op
      
      * Fix code format
      
      * Fix allow_unused case, return PyNone if tensor not initialize
      
      * Support output's stop_gradient related to create_graph
      
      * Support grad exception case in eager mode, fix coverage CI
      
      * Update ToPyObject, return PyNone if not initialize
      
      * AccumulationNode add FLAGS_retain_grad_for_all_tensor
      
      * Fix ci issue
      
      * Fix CI issue
      
      * fix, use core.eager.Tensor
      
      * Add func SetBufferSlotRankZeros for GradTensorHolder
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Support retain_graph by using ClearTensorWrappers
      
      * Update retain_graph and no_grad_vars related test case
      
      * Update code gen logic for ClearTensorWrappers
      
      * Fix by override statement
      
      * fix override func args
      
      * Support retain_graph, update unit tests
      
      * Updated ClearTensorWrappers logic
      
      * fix grad python interface
      
      * Use deep copy and update unit tests
      
      * Polish code
      
      * Polish code
      
      * Fix CI issue, Deep copy only use when user set grad_tensors
      
      * Fix CI, use Backward instead RunBackward
      
      * Fix CI, Declare kernel explicitly in test file
      
      * Polish, remove vector of TensorWrapper
      
      * Refactor the logic of grad/backward, polish codes
      
      * Update code after merge upstream develop
      
      * Polish after merge upstream develop
      
      * Update to adapt new GradNodeBase superclass
      
      * Fix error introduced during conflict resolution
      
      * Update purify potential_startup_nodes logic
      
      * Fix errors
      
      * Polish code
      
      * Remove useless args for ToPyObject
      
      * Remove useless TensorWrappersSet
      
      * Fix code-format, re-install pre-commit
      
      * Fix pre-process logic for potential_startup_ops
      
      * Update unit tests, use eager mode
    • J
      fix copy_ problem by doing it with phi copy (#40521) · c1931beb
      Jiabin Yang committed
      * fix copy_ problem by doing it with phi copy
      
      * improve test coverage
      
      * refactor copy with sr kernel
    • C
      move grid sample op infershape (#40625) · b1b24463
      Chen Weihang committed
    • L
      Improve the performance of fake quantize OP (#40491) · 827b6a0e
      Leo Chen committed
      * Move the computation of moving average scale to device
      
      * Use register to save local maximum in a thread
  2. 16 Mar, 2022 21 commits
  3. 15 Mar, 2022 9 commits
    • C
      [Phi] Move determinant op kernel into phi (#40539) · a04a6bd5
      Chen Weihang committed
      * add determinant phi kernel
      
      * remove original determinant op kernel
      
      * add determinant grad phi kernel
      
      * fix determinant test failed
      
      * remove original determinant grad op kernel
    • L
      [phi] modify the shape OP and move inferMeta of shape,matrix_pow,multi_dot (#40506) · 31729a62
      Liu-xiandong committed
      * [phi] move matrix_power op
      
      * MatrixInverse fluid -> phi
      
      * modify the CMake to fix compile bug
      
      * delete useless comment
      
      * mutable memory -> phi Alloc
      
      * modify the include file
      
      * modify the include file
      
      * fix bug in CI compiler
      
      * [phi]modify the shape OP and move inferMeta of shape,matrix_pow,multi_dot
      
      * delete useless comment
      
      * fix bug in CI
      
      * modify after review
    • R
      add number count op (#39224) · 9bdee437
      Roc committed
      * add expert count op
      
      add ut for expert_count
      
      * update UT only for cuda
      
      * fix for rocm
      
      * update ut
      
      * add moe module
      
      * add expert count op
      
      add ut for expert_count
      
      * update UT only for cuda
      
      * update ut
      
      * add moe module
      
      * make expert count private
      
      * rename expert count op
      Co-authored-by: Nhlygit66666 <2570058140@qq.com>
    • X
      run python api in eager model and filter the out in argument list (#40523) · 4d886f75
      xiongkun committed
      * run python api in eager model and filter the out in argument list
      
      * fix code
    • Z
      Fixed issues with generated scale operator (#40482) · 30417999
      Zhanlue Yang committed
      * Fixed issues with generated scale operator
      
      * Fixed minor issues
    • F
      [NPU] add AMP O1 support (#40362) · 69dd43d1
      furnace committed
      * [NPU] add AMP O1 support
      
      * [NPU] fix NOTE and warnings
    • C
      [Phi] Move gather op kernel into phi (#40500) · 0c703fe7
      Chen Weihang committed
      * add phi gather kernel
      
      * update year
      
      * remove original gather op kernel
      
      * add gather grad phi kernels
      
      * remove origin gather grad kernel
      
      * fix failed npu and xpu
      
      * fix xpu compile failed
    • J
      oneDNN NHWC fixes (#40049) · dde9cec0
      Jacek Czaja committed
      * - Prototype of third solution
      
      - fix
      
      - compilation fixes
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - compilation fix
      
      - comment fix
      
      - lint
      
      update mkldnn conv_elementwise_add_fuse_pass ut
      
      - NHWC changes to prelu
      
      - alpha dims
      
      - UT fix
      
      - fix to UT
      
      - lint
      
      - Some fixes
      
      - added to BWD of prelu NHWC support
      
      - reverted removal of resetting cu_layout in clearing of caching
      
      * - Small changes
      
      * - compilation fix
      
      * - fix
      
      * - fix
      
      * lint
      
      * - fixes after internal review
      
      * - compilation fix
      
      * - lint
    • T
      add shard_id (#40261) · 6b7d4845
      Thunderbrook committed
      * shard_id
      
      * format