1. 14 Apr, 2022 4 commits
    • FC+elementwise_add (residual connection) (#41776) · 92d8d0bc
      Sławomir Siwek committed
      * Change tensor name to match activation
      
      * declare fc_eltwise_add pass
      
      * merge conv_eltwise refactor PR
      
      * first compilable draft
      
      * unittest feedback tools
      
      * Fuse pass tester
      
      * Move IsReachable() to shared file
      
      * 100% coverage of fuse_pass_tester.cc
      
      * register pass
      
      * Add bias node
      
      * Improve unit tests / remove bias node from pattern
      
      * improve fc_eltwiseadd_unittest
      
      * cancel eltwise_add fuse if act is already fused
      
      * Add elementwise_input scale
      
      * Residual MVP
      
      * Add new FC attrs
      
      * Add more test cases
      
      * Add missing op attrs
      
      * Adapt code to new Elementwise pattern
      
      * reuse existing fcpattern
      
      * improve code style
      
      * remove unused arguments
      
      * fix typo
      
      * remove whitespace
      
      * remove int8 related code
      
      * Remove attributes from base ops
      
      * style
      
      * style check
      
      * Remove input from base op
      
      * Set attribute during fuse
      
      * ut timeout
      
      * download and test model
      
      * DRY
      
      * apply feedback from review
      
      * Style check
      
      * fix typo
      
      * cosmetic changes
      
      * explicitly set residual as output
      
      * VIT-OCR accuracy check
      
      * trigger CI
      
      * remove whitespaces
      
      * fix missing data file
    • bda4965a
    • add mkldnn int8 pass [step3] (#41599) · 8e2d4d30
      baoachun committed
      * add mkldnn int8 pass [step3]
      
      * Add test for compute_propagate_scales_mkldnn_pass
      
      * update pass
      
      * update api comment and python api
      Co-authored-by: wozna <joanna.wozna@intel.com>
    • Added shuffle_channel BF16/FP32 FWD oneDNN kernel (#39756) · c7623d72
      jakpiase committed
      * added shuffle_channel bf16/fp32 fwd kernel
      
      * added missing files
      
      * CI fix
      
      * changed from pten to phi
      
      * tmp save
      
      * added reviewers suggestions
      
      * fix for test
  2. 13 Apr, 2022 1 commit
    • init roll convert (#41689) · 14c3c450
      feng_shuai committed
      * init roll convert
      
      * add ut for roll convert
      
      * roll convert don't support trt6.0
      
      * fix: change ut for trt 7.0.0.1
  3. 12 Apr, 2022 4 commits
  4. 07 Apr, 2022 3 commits
  5. 06 Apr, 2022 2 commits
  6. 05 Apr, 2022 1 commit
  7. 02 Apr, 2022 2 commits
  8. 01 Apr, 2022 3 commits
  9. 31 Mar, 2022 4 commits
    • add multiclass nms3 trt converter (#41181) · 08c3edb3
      wangxinxin08 committed
      * add multiclass_nms3 converter
    • Using DistConfig in Paddle Inference (#41128) · dc0702fe
      TeslaZhao committed
      * Pass compat of conv_transpose_bias_mkldnn_fuse_pass
      
      * Fix a bug of strided_slice op, about the axes parameter access memory out of bounds
      
      * Fix a bug of strided_slice op, about the axes parameter access memory out of bounds
      
      * Fix a bug of transpose op, about accessing memory out of bounds of the perm param
      
      * op:transpose_op supports bool type
      
      * op:transpose_op supports bool type
      
      * Keep strided_slice op behavior consistent with slice op when starts input is less than -rank
      
      * Using DistConfig in inference
    • add flatten2,reshape2,squueze2_trt_fuse_pass test cast (#41031) · 7ef69202
      heliqi committed
      * add flatten2,reshape2,squueze2_trt_fuse_pass  test cast
      
      * add flatten2,reshape2,squueze2_trt_fuse_pass  test cast
      
      * add flatten2,reshape2,squueze2_trt_fuse_pass  test cast
    • remove shape check (#41143) · 4b9e748a
      wenbin committed
  10. 30 Mar, 2022 2 commits
  11. 29 Mar, 2022 1 commit
  12. 24 Mar, 2022 1 commit
  13. 21 Mar, 2022 1 commit
  14. 18 Mar, 2022 1 commit
  15. 17 Mar, 2022 5 commits
    • CopyFromCpu and CopyToCpu of Onnxruntime back-end optimize (#40561) · fcbb7440
      heliqi committed
      * add onnxruntime predictor
      
      * Add code comments
      
      * support link paddle2onnx onnxruntime
      
      * support onnxruntime with python
      
      * support onnxruntime with python
      
      * support onnxruntime with windows
      
      * paddle2onnx compile with windows
      
      * supoort windows compile
      
      * supoort windows compile with onnxruntime
      
      * supoort windows compile with paddle2onnx
      
      * supoort mac compile
      
      * compile with mac
      
      * compile with mac
      
      * add code comments
      
      * fix remind word
      
      * code optimization
      
      * add test case
      
      * add test case
      
      * add inference demo_ci test case
      
      * fix compile paddle2onnx with no python
      
      * add inference demo_ci test case
      
      * add inference demo_ci test case
      
      * add inference infer_ut test case
      
      * support c go api and test cases
      
      * add converage test case
      
      * add converage test case
      
      * add capi test case
      
      * add capi test case
      
      * fix onnxruntime copyfromcpu and copytocpu
      
      * fix goapi
      
      * modify code
    • Move layer norm to phi (#40193) · 681a6865
      hong committed
      * update
      
      * fix bugs; test=develop
      
      * update; test=develop
      
      * fix test compile error; test=develop
      
      * fix cpu compile error; test=develop
      
      * fix test error; test=develo
      
      * fix layer_norm_op plugin error; test=develop
      
      * fix error; test=develop
      
      * fix test bug; test=develop
      
      * update; test=develop
      
      * polish code; test=develop
      
      * fix bugs; test=develop
      
      * remove unused depency; test=develop
      
      * polish code; test=develop
    • move activation sigmoid (#40626) · ed8a9370
      YuanRisheng committed
    • [fleet executor] fleet executor for npu (#40607) · 81848fff
      Yuang Liu committed
    • support gpu mixed precision inference (#40531) · 06fee998
      baoachun committed
  16. 15 Mar, 2022 1 commit
    • [Phi]Move Tanh/BRelu/LeakyRelu/ThresholdedRelu Kernels to Phi (#40385) · d7112180
      YuanRisheng committed
      * move activation op
      
      * adjust code format
      
      * fix compile bugs
      
      * fix ci bugs
      
      * code format adjust
      
      * code format adjust2
      
      * activate ci status
      
      * modify according to comment
      
      * move activation kernel
      
      * revert relu6
      
      * reduce add code
      
      * perfect use_phi_functor
      
      * completing func name
      
      * fix bugs when run ci
      
      * fix bugs when run infr
      
      * modifpy infrt get kernel signature
  17. 14 Mar, 2022 2 commits
    • Add an elementwise + activation fusion pass. (#36541) · 3f219160
      Tomasz Socha committed
      * Add elementwise add and activation fuse pass
      
      * Fix copy ellision
      
      * More flexible pattern detector
      
      * More flexible fusion pass
      
      * Update lists for pass
      
      * Add support for Pow operator
      
      * Add support for more activation types
      
      * Style
      
      * Rename fusion pass
      
      * First version of tests
      
      * Dirty version of pass
      
      * Polished version
      
      * Update pbtxt
      
      * Style
      
      * Update names
      
      * Style
      
      * Use PADDLE_ENFORCE_EQ
      
      * Save error message to variable
      
      * WO for error checks
      
      * CR
      
      * Static style check
      
      * Add missing 'activation_scale' attribute
      
      * Add relu6 and sigmoid activations
      
      * Style
      
      * Fix fuse list formating
      
      * Sync filenames for fuse pass files
      
      * Fix cmake after move
      
      * Fix registration
      
      * Fix pass name in tests
      
      * Add missing activations to checker
      
      * WIPS
      
      * Working mul op
      
      * Working sub
      
      * Working Add
      
      * Remove pten includes
      
      * Remove some forward declarations
      
      * Remove Includes
      
      * Fixes
      
      * Remove default kernels
      
      * Add check if post_ops attributes are avaliable
      
      * Style
      
      * Code adjustment
      
      * Register default kernels
      
      * We have year 2022 not 2021...
      Co-authored-by: jakpiase <jakpia21@gmail.com>
      Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>
      
      * Fast review fixes
      Co-authored-by: jakpiase <jakpia21@gmail.com>
      Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>
      
      * Review Fix
      
      * Rename one_dnn -> onednn
      
      * Style after review
      
      * Fast and dirty fix for quantization
      
      * Update tests
      
      * Style
      
      * Fix mkldnn_quantizer config
      
      * Add Joanna's suggestion.
      
      * Check if operator is explicitly disables on OneDNN
      
      * Try to use unregistered attributes
      
      * Style
      
      * Test new framework
      
      * FXI
      
      * FXII
      
      * Update test
      
      * Style
      Co-authored-by: jakpiase <jakpia21@gmail.com>
      Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>
    • Move Pool OPs to phi (#40208) · 88ec08a7
      From00 committed
      * Move Pool OPs to phi
      
      * Fix CI error
      
      * Fix conflicts
  18. 10 Mar, 2022 2 commits
    • Inference add ONNXRuntime back-end (#39988) · 431afc39
      heliqi committed
      * add onnxruntime predictor
      
      * Add code comments
      
      * support link paddle2onnx onnxruntime
      
      * support onnxruntime with python
      
      * support onnxruntime with python
      
      * support onnxruntime with windows
      
      * paddle2onnx compile with windows
      
      * supoort windows compile
      
      * supoort windows compile with onnxruntime
      
      * supoort windows compile with paddle2onnx
      
      * supoort mac compile
      
      * compile with mac
      
      * compile with mac
      
      * add code comments
      
      * fix remind word
      
      * code optimization
      
      * add test case
      
      * add test case
      
      * add inference demo_ci test case
      
      * fix compile paddle2onnx with no python
      
      * add inference demo_ci test case
      
      * add inference demo_ci test case
      
      * add inference infer_ut test case
      
      * support c go api and test cases
      
      * add converage test case
      
      * add converage test case
      
      * add capi test case
      
      * add capi test case
    • Move dropout to phi (#40148) · 99fc1b08
      hong committed
      * move dropout to phi; test=develop
      
      * fix xpu, npu compile error; test=develop