Commits · d7d9807e8dd45ca11da43c6d0bfd7b84819465b4 · PaddlePaddle / Paddle

02 Sep, 2022 1 commit
- enable add passes pre-calculating scales/quantizing weights (#44680) · cdb36da4
  Committed by Sylwester Fraczek on Sep 02, 2022
  
  cdb36da4
30 Aug, 2022 1 commit
- [Paddle-TRT] constant-folding (#45494) · 97f43a8e
  Committed by zhoutianzi666 on Aug 30, 2022
```
add constant folding pass; for some models, it gives lower latency
```
  97f43a8e
29 Aug, 2022 1 commit
- TensorRT Engine context memory bind with predictor id (#45468) · 02621079
  Committed by Yuanle Liu on Aug 29, 2022
  
  02621079
22 Aug, 2022 3 commits
- Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  Committed by joanna.wozna.intel on Aug 22, 2022
```
* Add int8 support for matmul+elementwise_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- Extend conv_concat_relu to support all activations (#45089) · d03ef054
  Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
- remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  Committed by Yuanle Liu on Aug 22, 2022
  
  25d58db6
18 Aug, 2022 2 commits

[inference]predictor add GetInputType interface (#45143) · a8ae87f1

Committed by heliqi on Aug 18, 2022

* predictor add GetInputType interface

* predictor change GetInputType to GetInputTypes

* predictor add tester

* predictor add tester

* predictor change GetInputType to GetInputTypes

* predictor change GetInputType to GetInputTypes

* predictor add tester

a8ae87f1

fix infer tans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer trans scope

* fix infer trans scope

* fix infer trans scope

* fix infer trans scope
Co-authored-by: dingjiawei <327396238@qq.com>

2d0bb2c3

16 Aug, 2022 2 commits

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

memoptim and fp16 mixed precision (#45132) · fa890092

Committed by Wilber on Aug 16, 2022

fa890092

15 Aug, 2022 1 commit
- fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
  Committed by Yuanle Liu on Aug 15, 2022
  
  ac0553a0
14 Aug, 2022 1 commit
- Revert "[Paddle Inference] Support cuda_graph. (#44878)" (#45115) · b0e7681f
  Committed by xiaoxiaohehe001 on Aug 14, 2022
```
This reverts commit 84bf5c31.
```
  b0e7681f
10 Aug, 2022 1 commit
- [Paddle Inference] Support cuda_graph. (#44878) · 84bf5c31
  Committed by xiaoxiaohehe001 on Aug 10, 2022
```
* cuda_graph

* cuda_graph_

* cuda_graph_

* cuda_graph_
```
  84bf5c31
05 Aug, 2022 2 commits

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

Committed by Sławomir Siwek on Aug 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT rules

* feedback from review

d0cf9d9d

update trt workspace size param (#44469) · bdce552b

Committed by Zhang Jun on Aug 05, 2022

* update trt workspace size param

* update

* update

* update

* use int64_t

* use int64_t

* update

* update

bdce552b

04 Aug, 2022 3 commits

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

Committed by Sławomir Siwek on Aug 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multiple outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

0420d514

[Paddle-TRT] add Rnn (#44678) · ffc8defa

Committed by zhoutianzi666 on Aug 04, 2022
```
* add rnn
```
ffc8defa

convert support multi block. (#44866) · b4a4eef2

Committed by Wilber on Aug 04, 2022
```
* convert support multi block.

* update
```
b4a4eef2

02 Aug, 2022 1 commit

Multihead matmul fp16 (#44792) · 0fd8ee63

Committed by Wilber on Aug 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

0fd8ee63

01 Aug, 2022 3 commits
- unify gpu context (#44740) · 86763023
  Committed by Leo Chen on Aug 01, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes
```
  86763023
- infer context fix place error. (#44726) · 74e46a93
  Committed by Wilber on Aug 01, 2022
```
* infer context fix place error.

* update

* update
```
  74e46a93
- ort backend support output mutable data (#44724) · 3948c243
  Committed by heliqi on Jul 31, 2022
  
  3948c243
29 Jul, 2022 1 commit
- fused_fc_elementwise_layernorm_op support fp16 (#44710) · 856f741a
  Committed by ming1753 on Jul 29, 2022
```
* fused_fc_elementwise_layernorm support fp16

* fused_fc_elementwise_layernorm support double
```
  856f741a
28 Jul, 2022 1 commit
- clone ort_predictor reuse session (#44703) · 72b65d6b
  Committed by heliqi on Jul 28, 2022
  
  72b65d6b
26 Jul, 2022 1 commit
- inference multi stream support handle lazy init. (#44563) · 1892a441
  Committed by Wilber on Jul 26, 2022
```
* multi stream support handle lazy init.

* support eigen lazy init

* update

* fix ci problem
```
  1892a441
22 Jul, 2022 1 commit
- add batch stream (#44524) · 4f86092b
  Committed by Wilber on Jul 22, 2022
  
  4f86092b
21 Jul, 2022 2 commits
- Fc fp16 (#44505) · 3e1280ea
  Committed by ming1753 on Jul 21, 2022
```
* fc support fp16

* add a ‘,’ on paddle_pass_builder.cc

* fc support fp16 on non-cuda.
```
  3e1280ea
- [Paddle inference] Add conv_fusion_fp16 (#44435) · 37455714
  Committed by xiaoxiaohehe001 on Jul 21, 2022
```
* convfusionfp16

* convfusionfp16

* convfusionfp16
```
  37455714
19 Jul, 2022 3 commits
- Rename BOOST_GET macros (#44368) · 4b085c57
  Committed by Ruibiao Chen on Jul 19, 2022
```
* Rename BOOST_GET macros

* Fix conflicts
```
  4b085c57
- [Paddle-TRT] Shape sum fix scale (#44394) · 6fb2958e
  Committed by zhoutianzi666 on Jul 19, 2022
```
* shape sum

* add shape, sum trt layer
```
  6fb2958e
- update (#44418) · d5f0ed4b
  Committed by Wilber on Jul 19, 2022
  
  d5f0ed4b
18 Jul, 2022 1 commit
- [Paddle-TRT] reshape fill_constant (#44314) · b7db8457
  Committed by zhoutianzi666 on Jul 18, 2022
```
* reshape fill_constant

* commit

* commit
```
  b7db8457
15 Jul, 2022 1 commit
- add fused token prune op and plugin (#44281) · d881d690
  Committed by RichardWooSJTU on Jul 15, 2022
```
* add fused token prune op and plugin
```
  d881d690
13 Jul, 2022 1 commit
- [CustomKernel] phi capi add inference support (#44268) · daa6cb92
  Committed by ronnywang on Jul 13, 2022
  
  daa6cb92
12 Jul, 2022 1 commit

matmul+activation fuse pass (#43519) · 3333a439

Committed by Sławomir Siwek on Jul 12, 2022

* add method for post ops

* format code

* gpd

* format style

* add matmul+act test

* implement matmul+activation

* whitespaces

* code style

* python code format

* Increase UT timeout

* code format

* update style

* generalize activation fuse passes

* change order

* Unify activation GPD

* Revert changes with op_act

* remove softmax mkldnn attrs

* set common name for act attributes

* whitespace

* append postops by helper function

* ut style

* revert changes related to quantization

* Reduce redundancy

* reduce number of parameters

* trigger CI

* validate attribute

* trim unit test

3333a439

11 Jul, 2022 2 commits
- Quantize shape operator (#44124) · d4372a1e
  Committed by Zuza Gawrysiak on Jul 11, 2022
```
* Quantize shape operator

* Add shape op to propagate scales pass
```
  d4372a1e
- [Inference]ort backend optimizer (#44136) · 9a3054c6
  Committed by heliqi on Jul 11, 2022
```
* add ort clone interface

* paddle2onnx update to 1.0.0rc

* ort input_tensor use mutable data of scope
```
  9a3054c6
08 Jul, 2022 1 commit
- Inference support mixed-precision model [3] (#44057) · 7f958728
  Committed by Wilber on Jul 08, 2022
  
  7f958728
07 Jul, 2022 1 commit

[Windows CI] copy onnxruntime.dll to c++ test folder in windows (#44121) · 05b7ef8d

Committed by Sing_chan on Jul 07, 2022

* copy onnxruntime.dll to c++ test folder in windows

* remove ut that failed due to onnxruntime.dll

* test_api_impl failed of diff

* use TARGET to make sure if the test exist; use POST_BUILD to add copy command

05b7ef8d

30 Jun, 2022 1 commit
- modify graph_pattern to thread_local (#43942) · 6467ca0d
  Committed by JingZhuangzhuang on Jun 30, 2022
```
* modify graph_pattern to thread_local

* modify graph_pattern to thread_local
```
  6467ca0d

PaddlePaddle / Paddle synced successfully about 1 year ago