1. 27 Nov, 2021 1 commit
    • A
      [NPU] reorganization for device API abstraction (#37110) · 72241a6a
      Aganlengzi committed
      * [NPU] reorganization for device API abstraction
      
      * [NPU] delete old files
      
      * [NPU] fix npu_collective_helper
      
      * [NPU] fix collective_helper
      
      * [NPU] fix ut
      
      * [NPU] mod memory allocation and hccl_helper
      
      * [NPU] fix place_type
      
      * [NPU] split enforce.h
      
      * move acl* call into npu_info
      
      * merge conflict
      
      * fix merge
      
      * merge conflict
      
      * merge conflict
  2. 24 Nov, 2021 1 commit
  3. 19 Nov, 2021 1 commit
  4. 15 Nov, 2021 1 commit
    • C
      [Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a
      Chen Weihang committed
      * move extension into pten [no-verify]
      
      * append tensor methods by ext_tensor [no-verify]
      
      * append other tensor methods [no-verify]
      
      * ext related files tidy [no-verify]
      
      * include relation tidy [no-verify]
      
      * add pten tensor test [no-verify]
      
      * replace tensor in custom op & compile success
      
      * refine tensor constructor for unittest
      
      * custom relu jit run success
      
      * fix all custom op unittests
      
      * add inference cmake adapt [no-verify]
      
      * fix failed unittests
      
      * fix windows failed unittests
      
      * try to fix kunlun and inference failed
      
      * fix test_elementwise_api error
      
      * try to fix win compile failed
      
      * fix kunlun fp16 type error
      
      * remove useless handle error macro
      
      * add custom linear op test
      
      * fix compile failed & add win symbols
      
      * fix non pten kernel cast failed
      
      * add dll decl for api
      
      * polish several details
      
      * polish details by review comment
      
      * add dll_decl for register
  5. 11 Nov, 2021 1 commit
    • J
      Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc
      jakpiase committed
      * added softplus + activation fuse pass
      
      * minor change
      
      * implemented reviewer suggestion
      
      * minor fix
      
      * minor fix
      
      * added scale_out parameter
      
      * minor fix
      
      * fix for iScan CI
      
      * conditionally disabled logs
      
      * refactored pass builder
  6. 02 Nov, 2021 1 commit
  7. 27 Oct, 2021 3 commits
  8. 26 Oct, 2021 2 commits
    • W
      [Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul,... · 93c591e2
      Wangzheee committed
      [Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul, mul) convert pass, fix (matmul, mul) op_teller (#36652)
      
      * new_Matmul2ToMatmulToMul
      
      * new_Matmul2ToMatmulToMul
      
      * fix paddle_pass_builder
      
      * fix paddle_pass_builder
      
      * fix paddle_pass_builder
      
      * tem
      
      * tem
      
      * Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass
      
      * Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass
      
      * add matmul_broadcast_unitest
      
      * fix op_teller
    • F
      Pool3d 2.0 (#36545) · 229bae81
      feng_shuai committed
  9. 22 Oct, 2021 1 commit
  10. 21 Oct, 2021 1 commit
    • J
      Added matmul_v2+transpose+reshape fuse pass (#36481) · 856cb9c5
      jakpiase committed
      * added base changes for matmul_v2+trans+resh fuse pass
      
      * added full matmul_v2+transpose+reshape pass
      
      * removed a file added by mistake
      
      * added reviewers' suggestions
      
      * Changed ops type in checking compatibility version
      
      * Deleted one statement
  11. 20 Oct, 2021 1 commit
    • S
      Add FasterTokenizer Operator (#34491) · 3f2d6a3f
      Steffy-zxf committed
      Add tokenizer-related functionality for Transformer models so that text processing is consistent between training and prediction.
      
      * support the text string as an input Tensor
      * support the "VOCAB" unordered_map<wstring, int> as an input Tensor to look up tokens
      * Tokenizer used for BERT. This tokenizer applies an end-to-end, text string to wordpiece tokenization.
      * It first applies basic tokenization, followed by wordpiece tokenization.
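      The two-stage flow described above (basic tokenization, then wordpiece tokenization against a vocabulary map) can be illustrated with a minimal Python sketch. This is not the operator's implementation: the function names, the toy vocabulary, the `##` continuation prefix, and the `[UNK]` fallback are assumptions that follow the common BERT convention.

```python
import unicodedata


def basic_tokenize(text):
    """Lowercase the text, split on whitespace, and isolate punctuation marks."""
    tokens, current = [], []
    for ch in text.lower():
        if ch.isspace():
            if current:
                tokens.append("".join(current))
                current = []
        elif unicodedata.category(ch).startswith("P"):
            if current:
                tokens.append("".join(current))
                current = []
            tokens.append(ch)
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # assumed continuation marker
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word maps to the unknown token
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    """Text string in, list of token ids out (vocab must contain "[UNK]")."""
    return [vocab[p] for w in basic_tokenize(text) for p in wordpiece_tokenize(w, vocab)]


# Toy vocabulary, purely illustrative.
vocab = {"[UNK]": 0, "un": 1, "##aff": 2, "##able": 3, "!": 4, "hello": 5}
ids = tokenize("Unaffable hello!", vocab)
```

      With the toy vocabulary, "Unaffable" splits into `un`, `##aff`, `##able`, while a word with no matching pieces falls back to `[UNK]`.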
  12. 19 Oct, 2021 2 commits
  13. 14 Oct, 2021 1 commit
  14. 13 Oct, 2021 1 commit
    • W
      [PaddleInference] Pass: add int8 flag for op (#36042) · d7858c99
      Wangzheee committed
      * add_int_pass
      
      * add_int8_flag_pass
      
      * add_int8_flag_pass
      
      * fix CMakeLists.txt
      
      * fix test_trt_fc_fuse_quant_dequant_pass.py
      
      * fix python/paddle/fluid/tests/unittests/ir/inference/test_trt_fc_fuse_quant_dequant_pass.py
      
      * fix test_trt_fc_fuse_quant_dequant_pass.py
  15. 11 Oct, 2021 1 commit
    • W
      add mish trt plugin (#34123) · 2b7b752a
      wangxinxin08 committed
      * add mish trt plugin, compile & install success, run error. test=develop
      * modify code according to review
      * add TRT_NOEXCEPT for mish trt plugin
      * add unittest for mish trt plugin
      * remove unnecessary check of mish in op_teller.cc
      * fix some problems with trt8
      * add check and modify unittest while converting mish to trt plugin
      Co-authored-by: dengkaipeng <dengkaipeng@baidu.com>
  16. 23 Sep, 2021 1 commit
  17. 22 Sep, 2021 2 commits
  18. 18 Sep, 2021 1 commit
  19. 15 Sep, 2021 2 commits
  20. 14 Sep, 2021 2 commits
  21. 10 Sep, 2021 1 commit
    • W
      conv3d (#35507) · 42847d2e
      wenbin committed
      * conv3d
      
      * remove const_cast
      
      * modify ut
      
      * disable dynamic shape for trt6.0
      
      * remove trt5
  22. 06 Sep, 2021 2 commits
  23. 04 Sep, 2021 1 commit
  24. 31 Aug, 2021 1 commit
  25. 27 Aug, 2021 2 commits
  26. 26 Aug, 2021 1 commit
    • S
      Add copy from tensor (#34406) · ac33c0ca
      Shang Zhizhou committed
      * add api
      
      * temp save
      
      * revert
      
      * copytocpu async ok
      
      * fix style
      
      * copy sync ok
      
      * fix compile error
      
      * fix compile error
      
      * api done
      
      * update python async api
      
      * fix compile
      
      * remove async python api; add c++ async unittest
      
      * remove python async api
      
      * update unittest
      
      * update unittest
      
      * add C++ unittest for copytensor
      
      * add unittest
      
      * update namespace utils to class TensorUtils
      
      * add unittest
      
      * update unittest
      
      * update unittest
      
      * update code style
      
      * update code style
      
      * update unittest
  27. 18 Aug, 2021 1 commit
  28. 12 Aug, 2021 1 commit
  29. 06 Aug, 2021 1 commit
  30. 05 Aug, 2021 1 commit
  31. 29 Jul, 2021 1 commit
    • W
      Tile supported (#34388) · cffa15c5
      wenbin committed
      * tile op
      
      * more uts
      
      * disable tile if trt6.0
      
      * typo
      
      * fix timeout issue
      
      * opteller
      
      * opteller remove duplicate code
      
      * comments. test=document_fix
      
      * modify PADDLE_ENFORCE.
      
      * fix reduce_mean issue