1. 20 Apr, 2022 1 commit
  2. 19 Apr, 2022 2 commits
  3. 18 Apr, 2022 1 commit
  4. 17 Apr, 2022 1 commit
    • C
      [Perf] Optimize dygraph scheduling performance (#41696) · 7ee31a96
      Chen Weihang committed
      * split phi and fluid infermeta context
      
      * resolve conflict
      
      * fix type error
      
      * optimize scheduling perf
      
      * spec small vector size
      
      * replace all grad var name
      
      * fix test failed
      
      * move init default signature
      
      * polish details
      
      * polish details
      
      * fix no init bug
      
      * init sig for tests
      
      * add init sig for infer
      
      * fix infrt error
      
      * fix infrt failed
      
      * fix kunlun error
      
      * fix infrt failed
  5. 14 Apr, 2022 5 commits
    • J
      Fix to #38693 (minimal UT) (#41026) · d0f3296b
      Jacek Czaja committed
      * Add UT
      
      - Added missed data_layout
      
      - Added missing conversions
      
      - NDHWC added
      
      - NDHWC support in data_transform
      
      - another fix
      
      - candidate change
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - fix to hack
      
      - compilation fix
      
      - fix to automatic merge
      
      * - reduced UT
      
      * - fix
      
      * - lint
      
      * - fix to lint
    • S
      FC+elementwise_add (residual connection) (#41776) · 92d8d0bc
      Sławomir Siwek committed
      * Change tensor name to match activation
      
      * declare fc_eltwise_add pass
      
      * merge conv_eltwise refactor PR
      
      * first compilable draft
      
      * unittest feedback tools
      
      * Fuse pass tester
      
      * Move IsReachable() to shared file
      
      * 100% coverage of fuse_pass_tester.cc
      
      * register pass
      
      * Add bias node
      
      * Improve unit tests / remove bias node from pattern
      
      * improve fc_eltwiseadd_unittest
      
      * cancel eltwise_add fuse if act is already fused
      
      * Add elementwise_input scale
      
      * Residual MVP
      
      * Add new FC attrs
      
      * Add more test cases
      
      * Add missing op attrs
      
      * Adapt code to new Elementwise pattern
      
      * reuse existing fcpattern
      
      * improve code style
      
      * remove unused arguments
      
      * fix typo
      
      * remove whitespace
      
      * remove int8 related code
      
      * Remove attributes from base ops
      
      * style
      
      * style check
      
      * Remove input from base op
      
      * Set attribute during fuse
      
      * ut timeout
      
      * download and test model
      
      * DRY
      
      * apply feedback from review
      
      * Style check
      
      * fix typo
      
      * cosmetic changes
      
      * explicitly set residual as output
      
      * VIT-OCR accuracy check
      
      * trigger CI
      
      * remove whitespaces
      
      * fix missing data file
    • S
      bda4965a
    • B
      add mkldnn int8 pass [step3] (#41599) · 8e2d4d30
      baoachun committed
      * add mkldnn int8 pass [step3]
      
      * Add test for compute_propagate_scales_mkldnn_pass
      
      * update pass
      
      * update api comment and python api
      Co-authored-by: Nwozna <joanna.wozna@intel.com>
    • J
      Added shuffle_channel BF16/FP32 FWD oneDNN kernel (#39756) · c7623d72
      jakpiase committed
      * added shuffle_channel bf16/fp32 fwd kernel
      
      * added missing files
      
      * CI fix
      
      * changed from pten to phi
      
      * tmp save
      
      * added reviewers suggestions
      
      * fix for test
  6. 13 Apr, 2022 1 commit
    • F
      init roll convert (#41689) · 14c3c450
      feng_shuai committed
      * init roll convert
      
      * add ut for roll convert
      
      * roll convert doesn't support trt6.0
      
      * fix: change ut for trt 7.0.0.1
  7. 12 Apr, 2022 4 commits
  8. 07 Apr, 2022 3 commits
  9. 06 Apr, 2022 2 commits
  10. 05 Apr, 2022 1 commit
  11. 02 Apr, 2022 2 commits
  12. 01 Apr, 2022 3 commits
  13. 31 Mar, 2022 4 commits
    • W
      add multiclass nms3 trt converter (#41181) · 08c3edb3
      wangxinxin08 committed
      * add multiclass_nms3 converter
    • T
      Using DistConfig in Paddle Inference (#41128) · dc0702fe
      TeslaZhao committed
      * Pass compat of conv_transpose_bias_mkldnn_fuse_pass
      
      * Fix a bug of strided_slice op, about the axes parameter access memory out of bounds
      
      * Fix a bug of strided_slice op, about the axes parameter access memory out of bounds
      
      * Fix a bug of transpose op, about accessing memory out of bounds of the perm param
      
      * op:transpose_op supports bool type
      
      * op:transpose_op supports bool type
      
      * Keep strided_slice op behavior consistent with slice op when starts input is less than -rank
      
      * Using DistConfig in inference
    • H
      add flatten2,reshape2,squeeze2_trt_fuse_pass test case (#41031) · 7ef69202
      heliqi committed
      * add flatten2,reshape2,squeeze2_trt_fuse_pass test case
      
      * add flatten2,reshape2,squeeze2_trt_fuse_pass test case
      
      * add flatten2,reshape2,squeeze2_trt_fuse_pass test case
    • W
      remove shape check (#41143) · 4b9e748a
      wenbin committed
  14. 30 Mar, 2022 2 commits
  15. 29 Mar, 2022 1 commit
  16. 24 Mar, 2022 1 commit
  17. 21 Mar, 2022 1 commit
  18. 18 Mar, 2022 1 commit
  19. 17 Mar, 2022 4 commits
    • H
      CopyFromCpu and CopyToCpu of Onnxruntime back-end optimize (#40561) · fcbb7440
      heliqi committed
      * add onnxruntime predictor
      
      * Add code comments
      
      * support link paddle2onnx onnxruntime
      
      * support onnxruntime with python
      
      * support onnxruntime with python
      
      * support onnxruntime with windows
      
      * paddle2onnx compile with windows
      
      * support windows compile
      
      * support windows compile with onnxruntime
      
      * support windows compile with paddle2onnx
      
      * support mac compile
      
      * compile with mac
      
      * compile with mac
      
      * add code comments
      
      * fix remind word
      
      * code optimization
      
      * add test case
      
      * add test case
      
      * add inference demo_ci test case
      
      * fix compile paddle2onnx with no python
      
      * add inference demo_ci test case
      
      * add inference demo_ci test case
      
      * add inference infer_ut test case
      
      * support c go api and test cases
      
      * add coverage test case
      
      * add coverage test case
      
      * add capi test case
      
      * add capi test case
      
      * fix onnxruntime copyfromcpu and copytocpu
      
      * fix goapi
      
      * modify code
    • H
      Move layer norm to phi (#40193) · 681a6865
      hong committed
      * update
      
      * fix bugs; test=develop
      
      * update; test=develop
      
      * fix test compile error; test=develop
      
      * fix cpu compile error; test=develop
      
      * fix test error; test=develop
      
      * fix layer_norm_op plugin error; test=develop
      
      * fix error; test=develop
      
      * fix test bug; test=develop
      
      * update; test=develop
      
      * polish code; test=develop
      
      * fix bugs; test=develop
      
      * remove unused dependency; test=develop
      
      * polish code; test=develop
    • Y
      move activation sigmoid (#40626) · ed8a9370
      YuanRisheng committed
    • Y
      [fleet executor] fleet executor for npu (#40607) · 81848fff
      Yuang Liu committed