Commits · 7ee31a96b436de4b0701de2ba56bd0b2a653994c · PaddlePaddle / Paddle

17 Apr, 2022 · 1 commit

[Perf] Optimize dygraph scheduling performance (#41696) · 7ee31a96

Committed by Chen Weihang on Apr 17, 2022

* split phi and fluid infermeta context

* resolve conflict

* fix type error

* optimize scheduling perf

* spec small vector size

* replace all grad var name

* fix test failed

* move init defalut signature

* polish details

* polish details

* fix no init bug

* init sig for tests

* add init sig for infer

* fix infrt error

* fix infrt failed

* fix kunlun error

* fix infrt failed

7ee31a96

14 Apr, 2022 · 5 commits

Fix to #38693 (minimal UT) (#41026) · d0f3296b

Committed by Jacek Czaja on Apr 14, 2022

* Add UT

- Added missed data_layout

- Added missing conversions

- NDHWC added

- NDHWC support in data_transform

- another fix

- condddate change

- fix

u- fix

- fix

- fix

- fix

- fix

- fix to hack

- compilation fix

- fix to automatic merge

* - reduced UT

* - fix

* - lint

* - fix to lint

d0f3296b

FC+elementwise_add (residual connection) (#41776) · 92d8d0bc

Committed by Sławomir Siwek on Apr 14, 2022

* Change tensor name to match activation

* declare fc_eltwise_add pass

* merge conv_eltwise refactor PR

* first compilable draft

* unittest feedback tools

* Fuse pass tester

* Move IsReachable() to shared file

* 100% coverage of fuse_pass_tester.cc

* register pass

* Add bias node

* Improve unit tests / remove bias node from pattern

* improve fc_eltwiseadd_unittest

* cancel eltwise_add fuse if act is already fused

* Add elementwise_input scale

* Residual MVP

* Add new FC attrs

* Add more test cases

* Add missing op attrs

* Adapt code to new Elementwise pattern

* reuse existing fcpattern

* improve code style

* remove unused arguments

* fix typo

* remove whitespace

* remove int8 related code

* Remove attributes from base ops

* style

* style check

* Remove input from base op

* Set attribute during fuse

* ut timeout

* download and test model

* DRY

* apply feedback from review

* Style check

* fix typo

* cosmetic changes

* explicitly set residual as output

* VIT-OCR accuracy check

* trigger CI

* remove whitespaces

* fix missing data file

92d8d0bc

fix bug of set cuda lib in demo_ci and infer_ut (#41677) · bda4965a

Committed by Sing_chan on Apr 14, 2022

bda4965a

add mkldnn int8 pass [step3] (#41599) · 8e2d4d30

Committed by baoachun on Apr 14, 2022

* add mkldnn int8 pass [step3]

* Add test for compute_propagate_scales_mkldnn_pass

* update pass

* update api comment and python api
Co-authored-by: wozna <joanna.wozna@intel.com>

8e2d4d30

Added shuffle_channel BF16/FP32 FWD oneDNN kernel (#39756) · c7623d72

Committed by jakpiase on Apr 14, 2022

* added shuffle_channel bf16/fp32 fwd kernel

* added missing files

* CI fix

* changed from pten to phi

* tmp save

* added reviewers suggestions

* fix for test

c7623d72

13 Apr, 2022 · 1 commit

init roll convert (#41689) · 14c3c450

Committed by feng_shuai on Apr 13, 2022

* init roll convert

* add ut for roll convert

* roll convert don't support trt6.0

* fix: change ut for trt 7.0.0.1

14c3c450

12 Apr, 2022 · 4 commits

strided_slice (#41573) · b861022a

Committed by feng_shuai on Apr 12, 2022

* strided_slice

* fix: compiler error because of size()

* fix: warning

* fix : warning

* init input_shape

* fix:forget punctuation

b861022a

add python share_data interface (#41626) · be4a2077

Committed by JingZhuangzhuang on Apr 12, 2022

* add python share_data interface

* Update inference_api.cc

* Update inference_api.cc

* add python share_data interface

be4a2077

add trt supoort for slice op (#41467) · f403fb69

Committed by feng_shuai on Apr 12, 2022

* add trt supoort for slice op

* fix:output dims bug

* fix: test

* fix:for c++ coverage

* fix:c++ coverage

* fix: fix test bug

* fix: CI test

f403fb69

Add possibility to test native config in mkldnn tests (#41562) · b68bb428

Committed by joanna.wozna.intel on Apr 12, 2022

b68bb428

07 Apr, 2022 · 3 commits

modify inference model test build method to support multi version (#41027) · c9e0e10e

Committed by Sing_chan on Apr 07, 2022

* change inference demo_test build method to ninja to choose visual studio version automaticly

* notest;test=windows_ci_inference

* set cuda of demo_ci by arg,fix bug of ninja compile,test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci_inference"

* set lib_path according to generator

c9e0e10e

remove cudnn_deterministic=True (#41341) · cefa91fd

Committed by Zhang Jun on Apr 07, 2022

cefa91fd

modify infer gpu memory strategy (#41427) · 56e72b20

Committed by JingZhuangzhuang on Apr 07, 2022

* modify infer gpu memory strategy

* modify infer gpu memory strategy

56e72b20

06 Apr, 2022 · 2 commits

add div plugin and add filter (#41243) · 0c968b9d

Committed by feng_shuai on Apr 06, 2022

0c968b9d

[IPU] remove paddle_ipu shared library (#41307) · 229e91bf

Committed by Allen Guo on Apr 06, 2022

* remove paddle_ipu shared library

* fix unique_name

229e91bf

05 Apr, 2022 · 1 commit

add fake index and unittest for multiclass_nms3 trt (#41344) · 1bd8125f

Committed by wangxinxin08 on Apr 05, 2022

* add fake index and unittest for multiclass_nms3 trt

* modify unittest

1bd8125f

02 Apr, 2022 · 2 commits

[Paddle inference] support new quant_model (#41049) · 1b58ce14

Committed by Wangzheee on Apr 02, 2022

* paddle inference support new quant_model

1b58ce14

filter unsupported inputs for elementwise op in op teller (#41253) · 56f108ff

Committed by wangxinxin08 on Apr 02, 2022

* filter unsupported inputs for elementwise op in op teller

* add unittest for corner case

56f108ff

01 Apr, 2022 · 3 commits

reshape_opteller (#41090) · 15d5f6b9

Committed by xiaoxiaohehe001 on Apr 01, 2022

fix_reshape: for paddle-trt

15d5f6b9

[Eager] Support pinned (#41035) · f3270fc8

Committed by wanghuancoder on Apr 01, 2022

* support pinned, test=develop

* support async_write, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

f3270fc8

- Enabled fc of oneDNN for bert test (#41235) · 597d7efd

Committed by Jacek Czaja on Apr 01, 2022

597d7efd

31 Mar, 2022 · 4 commits

add multiclass nms3 trt converter (#41181) · 08c3edb3

Committed by wangxinxin08 on Mar 31, 2022

* add multiclass_nms3 converter

08c3edb3

Using DistConfig in Paddle Inference (#41128) · dc0702fe

Committed by TeslaZhao on Mar 31, 2022

* Pass compat of conv_transpose_bias_mkldnn_fuse_pass

* Fix a bug of strided_slice op, about the axes parameter access memory out of bounds

* Fix a bug of strided_slice op, about the axes parameter access memory out of bounds

* Fix a bug of transpose op, about accessing memory out of bounds of the perm param

* op:transpose_op supports bool type

* op:transpose_op supports bool type

* Keep strided_slice op behavior consistent with slice op when starts input is less than -rank

* Using DistConfig in inference

dc0702fe

add flatten2,reshape2,squueze2_trt_fuse_pass test cast (#41031) · 7ef69202

Committed by heliqi on Mar 31, 2022

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

7ef69202

remove shape check (#41143) · 4b9e748a

Committed by wenbin on Mar 31, 2022

4b9e748a

30 Mar, 2022 · 2 commits

move elementwise_mul selected rows input (#41042) · 13f1641d

Committed by YuanRisheng on Mar 30, 2022

13f1641d

Optimize the onnxruntime code (#41044) · f12b5260

Committed by heliqi on Mar 30, 2022

f12b5260

29 Mar, 2022 · 1 commit

add elementwise sub and elementwise div in tensorrt op teller (#40806) · f3022dfa

Committed by wangxinxin08 on Mar 29, 2022

* add elementwise sub and elementwise div in tensorrt op teller

* add unittest of elementwise mul, sub and div

f3022dfa

24 Mar, 2022 · 1 commit

[Phi] Move mul op kernel into phi (#40833) · 1b491818

Committed by Chen Weihang on Mar 24, 2022

* add mul phi kernel

* remove mul op kernel

* remove original mul grad op

* fix cinn test

* fix dygraph test failed

1b491818

21 Mar, 2022 · 1 commit

Move conv-transpose OPs to phi (#40675) · 1eb96eec

Committed by From00 on Mar 21, 2022

* Move conv-transpose OPs to phi

* Fix CI errors

* Fix CI errors

1eb96eec

18 Mar, 2022 · 1 commit

set +x to close showing command, update check_change code with linux (#40456) · 161d27dc

Committed by Sing_chan on Mar 18, 2022

161d27dc

17 Mar, 2022 · 5 commits

CopyFromCpu and CopyToCpu of Onnxruntime back-end optimize (#40561) · fcbb7440

Committed by heliqi on Mar 17, 2022

* add onnxruntime predictor

* Add code comments

* support link paddle2onnx onnxruntime

* support onnxruntime with python

* support onnxruntime with python

* support onnxruntime with windows

* paddle2onnx compile with windows

* supoort windows compile

* supoort windows compile with onnxruntime

* supoort windows compile with paddle2onnx

* supoort mac compile

* compile with mac

* compile with mac

* add code comments

* fix remind word

* code optimization

* add test case

* add test case

* add inference demo_ci test case

* fix compile paddle2onnx with no python

* add inference demo_ci test case

* add inference demo_ci test case

* add inference infer_ut test case

* support c go api and test cases

* add converage test case

* add converage test case

* add capi test case

* add capi test case

* fix onnxruntime copyfromcpu and copytocpu

* fix goapi

* modify code

fcbb7440

Move layer norm to phi (#40193) · 681a6865

Committed by hong on Mar 17, 2022

* update

* fix bugs; test=develop

* update; test=develop

* fix test compile error; test=develop

* fix cpu compile error; test=develop

* fix test error; test=develo

* fix layer_norm_op plugin error; test=develop

* fix error; test=develop

* fix test bug; test=develop

* update; test=develop

* polish code; test=develop

* fix bugs; test=develop

* remove unused depency; test=develop

* polish code; test=develop

681a6865

move activation sigmoid (#40626) · ed8a9370

Committed by YuanRisheng on Mar 17, 2022

ed8a9370

[fleet executor] fleet executor for npu (#40607) · 81848fff

Committed by Yuang Liu on Mar 17, 2022

81848fff

support gpu mixed precision inference (#40531) · 06fee998

Committed by baoachun on Mar 17, 2022

06fee998

15 Mar, 2022 · 1 commit

[Phi]Move Tanh/BRelu/LeakyRelu/ThresholdedRelu Kernels to Phi (#40385) · d7112180

Committed by YuanRisheng on Mar 15, 2022

* move activation op

* adjust code format

* fix compile bugs

* fix ci bugs

* code format adjust

* code format adjust2

* activate ci status

* modify according to comment

* move activation kernel

* revert relu6

* reduce add code

* perfect use_phi_functor

* completing func name

* fix bugs when run ci

* fix bugs when run infr

* modifpy infrt get kernel signature

d7112180

14 Mar, 2022 · 2 commits

Add an elementwise + activation fusion pass. (#36541) · 3f219160

Committed by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy ellision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formating

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are avaliable

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disables on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

3f219160

Move Pool OPs to phi (#40208) · 88ec08a7

Committed by From00 on Mar 14, 2022

* Move Pool OPs to phi

* Fix CI error

* Fix conflicts

88ec08a7

PaddlePaddle / Paddle: last synced successfully about 1 year ago