Commits · 84bf5c313d112acbb96d93bbe686afc4101bdb85 · BaiXuePrincess / Paddle

10 Aug, 2022 · 1 commit
- [Paddle Inference] Support cuda_graph. (#44878) · 84bf5c31
  Committed by xiaoxiaohehe001 on Aug 10, 2022
```
* cuda_graph

* cuda_graph_

* cuda_graph_

* cuda_graph_
```
05 Aug, 2022 · 2 commits

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

Committed by Sławomir Siwek on Aug 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT rules

* feedback from review

update trt workspace size param (#44469) · bdce552b

Committed by Zhang Jun on Aug 05, 2022

* update trt workspace size param

* update

* update

* update

* use int64_t

* use int64_t

* update

* update

04 Aug, 2022 · 3 commits

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

Committed by Sławomir Siwek on Aug 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multiple outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

[Paddle-TRT] add Rnn (#44678) · ffc8defa
Committed by zhoutianzi666 on Aug 04, 2022
```
* add rnn
```
convert support multi block. (#44866) · b4a4eef2
Committed by Wilber on Aug 04, 2022
```
* convert support multi block.

* update
```
02 Aug, 2022 · 1 commit

Multihead matmul fp16 (#44792) · 0fd8ee63

Committed by Wilber on Aug 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

01 Aug, 2022 · 3 commits
- unify gpu context (#44740) · 86763023
  Committed by Leo Chen on Aug 01, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes
```
- infer context fix place error. (#44726) · 74e46a93
  Committed by Wilber on Aug 01, 2022
```
* infer context fix place error.

* update

* update
```
- ort backend support output mutable data (#44724) · 3948c243
  Committed by heliqi on Jul 31, 2022
29 Jul, 2022 · 1 commit
- fused_fc_elementwise_layernorm_op support fp16 (#44710) · 856f741a
  Committed by ming1753 on Jul 29, 2022
```
* fused_fc_elementwise_layernorm support fp16

* fused_fc_elementwise_layernorm support double
```
28 Jul, 2022 · 1 commit
- clone ort_predictor reuse session (#44703) · 72b65d6b
  Committed by heliqi on Jul 28, 2022
26 Jul, 2022 · 1 commit
- inference multi stream support handle lazy init. (#44563) · 1892a441
  Committed by Wilber on Jul 26, 2022
```
* multi stream support handle lazy init.

* support eigen lazy init

* update

* fix ci problem
```
22 Jul, 2022 · 1 commit
- add batch stream (#44524) · 4f86092b
  Committed by Wilber on Jul 22, 2022
21 Jul, 2022 · 2 commits
- Fc fp16 (#44505) · 3e1280ea
  Committed by ming1753 on Jul 21, 2022
```
* fc support fp16

* add a ‘,’ on paddle_pass_builder.cc

* fc support fp16 on non-cuda.
```
- [Paddle inference] Add conv_fusion_fp16 (#44435) · 37455714
  Committed by xiaoxiaohehe001 on Jul 21, 2022
```
* convfusionfp16

* convfusionfp16

* convfusionfp16
```
19 Jul, 2022 · 3 commits
- Rename BOOST_GET macros (#44368) · 4b085c57
  Committed by Ruibiao Chen on Jul 19, 2022
```
* Rename BOOST_GET macros

* Fix conflicts
```
- [Paddle-TRT] Shape sum fix scale (#44394) · 6fb2958e
  Committed by zhoutianzi666 on Jul 19, 2022
```
* shape sum

* add shape, sum trt layer
```
- update (#44418) · d5f0ed4b
  Committed by Wilber on Jul 19, 2022
18 Jul, 2022 · 1 commit
- [Paddle-TRT] reshape fill_constant (#44314) · b7db8457
  Committed by zhoutianzi666 on Jul 18, 2022
```
* reshape fill_constant

* commit

* commit
```
15 Jul, 2022 · 1 commit
- add fused token prune op and plugin (#44281) · d881d690
  Committed by RichardWooSJTU on Jul 15, 2022
```
* add fused token prune op and plugin
```
13 Jul, 2022 · 1 commit
- [CustomKernel] phi capi add inference support (#44268) · daa6cb92
  Committed by ronnywang on Jul 13, 2022
12 Jul, 2022 · 1 commit

matmul+activation fuse pass (#43519) · 3333a439

Committed by Sławomir Siwek on Jul 12, 2022

* add method for post ops

* format code

* gpd

* format style

* add matmul+act test

* implement matmul+activation

* whitespaces

* code style

* python code format

* Increase UT timeout

* code format

* update style

* generalize activation fuse passes

* change order

* Unify activation GPD

* Revert changes with op_act

* remove softmax mkldnn attrs

* set common name for act attributes

* whitespace

* append postops by helper function

* ut style

* revert changes related to quantization

* Reduce redundancy

* reduce number of parameters

* trigger CI

* validate attribute

* trim unit test

11 Jul, 2022 · 2 commits
- Quantize shape operator (#44124) · d4372a1e
  Committed by Zuza Gawrysiak on Jul 11, 2022
```
* Quantize shape operator

* Add shape op to propagate scales pass
```
- [Inference]ort backend optimizer (#44136) · 9a3054c6
  Committed by heliqi on Jul 11, 2022
```
* add ort clone interface

* paddle2onnx update to 1.0.0rc

* ort input_tensor use mutable data of scope
```
08 Jul, 2022 · 1 commit
- Inference support mixed-precision model [3] (#44057) · 7f958728
  Committed by Wilber on Jul 08, 2022
07 Jul, 2022 · 1 commit

[Windows CI] copy onnxruntime.dll to c++ test folder in windows (#44121) · 05b7ef8d

Committed by Sing_chan on Jul 07, 2022

* copy onnxruntime.dll to c++ test folder in windows

* remove ut that failed due to onnxruntime.dll

* test_api_impl failed of diff

* use TARGET to check that the test exists; use POST_BUILD to add the copy command

30 Jun, 2022 · 1 commit
- modify graph_pattern to thread_local (#43942) · 6467ca0d
  Committed by JingZhuangzhuang on Jun 30, 2022
```
* modify graph_pattern to thread_local

* modify graph_pattern to thread_local
```
29 Jun, 2022 · 2 commits
- add equal trt converter (#43461) · 1dbbe20e
  Committed by ccrrong on Jun 29, 2022
```
* add comparisons trt converter
```
- inference support mixed-precision model [1]. (#43814) · c7694b82
  Committed by Wilber on Jun 29, 2022
```
* inference add convert to mixed model ability.
```
28 Jun, 2022 · 2 commits

Enable Bert on bfloat16 datatype (#43455) · 6d31dc93

Committed by Tomasz Socha on Jun 28, 2022

* Remove output arguments from functions.
Replace pointers with references

* Name used bool flags

* Reorder functions

* Enable bfloat16 data type

* Give declarations some space

* Style

* Style

fixes a bug, test=develop (#43884) · 5369378b

Committed by 石晓伟 on Jun 28, 2022

26 Jun, 2022 · 1 commit
- format all files in fluid using new config (#43776) · 576236a0
  Committed by Sing_chan on Jun 26, 2022
24 Jun, 2022 · 1 commit
- revert 40531 (#43807) · 7985407b
  Committed by Wilber on Jun 24, 2022
```
* revert 40531

* update
```
23 Jun, 2022 · 2 commits

add cast trt converter (#43447) · b6bf8994
Committed by ccrrong on Jun 23, 2022
```
* add cast trt converter
```

[external reviewing] Params to int8 pass (#42625) · b8b2d6a9

Committed by Sylwester Fraczek on Jun 22, 2022

* sylwek

prototype params to int8 pass

* trying to make warmup work

* wip

* wip

* change test to cpp test

* review fixes, refactoring

* more refactoring

* add erasevars

* change test to fixture

* rename pass

and reorder erasevars and graphsaferemovenodes

* fix

* more refactoring and fixed bug

* formatting

* remove scale count

* enforce message too short

* remove erasevars

erasevars could be the cause of memory issues

some other fixes

* add count of successful fuses to name of new nodes

* FindVar -> GetVar and use ConvResidual pattern

* use tensor->clear() instead of new variable

* Update paddle/fluid/framework/ir/mkldnn/params_quantization_mkldnn_pass_tester.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/fluid/framework/ir/mkldnn/params_quantization_mkldnn_pass_tester.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/fluid/inference/tests/api/analyzer_lexical_analysis_gru_tester.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* add log (review fix)

* review fix (2 functions to one)

* code review: Conv->QuantizeConv

* revert

* fix formatting

* remove unused functions

* add paddle enforce
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

22 Jun, 2022 · 2 commits
- Enhance gpu multihead matmul v3 fuse pass (#43529) · 561d09b9
  Committed by WJJ1995 on Jun 22, 2022
```
* fixed multihead matmul fuse pass

* Add unittests

* rm scale op

* fixed code style

* fixed code style

* resolve testcase failed

* add note
```
- paddle2onnx update to 0.9.8 (#43742) · 7bb72e37
  Committed by heliqi on Jun 22, 2022
21 Jun, 2022 · 2 commits

[Inference]Fix the ort Backend multiple input bug (#43621) · 61591afe

Committed by heliqi on Jun 21, 2022

* fix ort backend many inputs bug

* fix ort backend many inputs bug

* fix ort backend many inputs bug

* fix ort backend many inputs bug

* code format

* code format

Generalize conv+activation fuse pass (#43382) · 347e4b2e

Committed by Sławomir Siwek on Jun 21, 2022

* consolidate conv act passes

* generalize conv_activation

* integrate conv+act tests

* code style format

* whitespaces

* remove timeout from old tests

* implement comments from review

* restore ut

* whitespace

* code style

* transpose

* fixes after review

* method for getting act

* Change Paddle_enforce error type

* code format

* add missing opcompats

BaiXuePrincess / Paddle · up to date with the fork's source project