Commits · 0b79129d5a76eeb7e4cf004ad1d43347f58520da · PaddlePaddle / Paddle

Dec 19, 2022 · 7 commits
- [PHI decoupling] move gather_scatter_kernel from fluid to phi (#49132) · 0b79129d
  Committed by huangjiyi on Dec 19, 2022
```
* move gather_scatter_kernel from fluid to phi

* mv gather_scatter_kernel to gather_scatter_functor
```
  0b79129d
- simplify FallbackToCpu (#49124) · 7ffde4bc
  Committed by HongyuJia on Dec 19, 2022
  
  7ffde4bc
- remove expected_kernel_key in interpreter (#49120) · ff79c144
  Committed by HongyuJia on Dec 19, 2022
  
  ff79c144
- refactor: rename process group (#49137) · 22e416cf
  Committed by Wen Sun on Dec 19, 2022
  
  22e416cf
- [Paddle Inference]restart looup_table_v2 (#49119) · ffc39605
  Committed by Wangzheee on Dec 19, 2022
```
* restart looup_table_v2
```
  ffc39605
- [Paddle Inference] General optimization for no_varlen skiplayernorm (#49039) · b50dbe0b
  Committed by Wangzheee on Dec 19, 2022
```
* General optimization for no_varlen embedding layernorm
```
  b50dbe0b
- [PHI Decoupling] move maxouting and matrix_bit_code from fluid to phi (#49131) · 5e222dc2
  Committed by huangjiyi on Dec 19, 2022
```
* move maxouting from fluid to phi

* move matrix_bit_code from fluid to phi

* replace mutable_data and fix include

* fix include

* move gather_scatter_kernel from fluid to phi

* Revert "move gather_scatter_kernel from fluid to phi"

This reverts commit 3d0b1eaf179656072e8c483dfca688cccccdda01.
```
  5e222dc2
Dec 17, 2022 · 2 commits
- refactor: rename xccl files (#49127) · d4f43ad4
  Committed by Wen Sun on Dec 17, 2022
  
  d4f43ad4
- [Paddle Inference] Memory Optimize destruct argument (#49046) · 0b36655b
  Committed by xiaoxiaohehe001 on Dec 17, 2022
  
  0b36655b
Dec 16, 2022 · 4 commits
- refactor: rename files (#49117) · 40f3f4f0
  Committed by Wen Sun on Dec 16, 2022
  
  40f3f4f0
- change staticRNN to while (#48213) · 69536892
  Committed by hong on Dec 16, 2022
```
* change staticRNN to while

* update code

* fix rnn bug

* update

* fix _find_op_path_ bugs in append_backward.

* polish code

* revert op proto

* update

* udpate while

* format

* revert test while loop op

* fix create array

* fix windows error

* fix bug

* update

* fix array write bug

Co-authored-by: xiongkun <xiongkun03@baidu.com>
```
  69536892
- add PADDLE_WITH_TENSORRT to EnableTensorRtEngine (#49091) · c4f30c51
  Committed by Yuanle Liu on Dec 16, 2022
  
  c4f30c51
- Set once_flag to thread_local for StreamSafeCUDAAllcation (#49104) · 73ec9b78
  Committed by Ruibiao Chen on Dec 16, 2022
  
  73ec9b78
Dec 15, 2022 · 9 commits
- Add validity check for config in yaml (#49049) · d6b062e8
  Committed by zyfncg on Dec 15, 2022
```
* add validity check for config in yaml

* delete debug log
```
  d6b062e8
- [inference] move IsFloatVar() from tensorrt/ to api/ (#49070) · 2190ea09
  Committed by Zhang Jun on Dec 15, 2022
```
* move IsFloatVar() from tensorrt/ to api/
```
  2190ea09
- [PHI decoupling] move softmax from fluid to phi and remove cpu_vec.h in fluid (#48970) · 344b99e1
  Committed by huangjiyi on Dec 15, 2022
  
  344b99e1
- Add a persistent ibuilder to speedup unit test (#48906) · 9bd20aa0
  Committed by zlsh80826 on Dec 15, 2022
  
  9bd20aa0
- [PHI decoupling] Remove fluid imports from MKLDNN code (#48981) · 4d5a5533
  Committed by Sławomir Siwek on Dec 15, 2022
```
* fix wrong handler name

* mkldnn_engine -> onednn_engine

* remove fluid/errors.h imports

* remove fluid/enforce.h imports

* remove note and unnecessary import

* remove fluid/pretty_log.h imports

* remove fluid/place.h imports

* remove fluid/data_layout_transform.h imports

* remove fluid/device_context.h imports

* remove mkldnn_helper code

* remove fluid/mkldnn_reuse.h imports

* pretty_log import
```
  4d5a5533
- SetDeviceId in StreamSafeCUDAAllocation (#49080) · 32633c8e
  Committed by Ruibiao Chen on Dec 15, 2022
  
  32633c8e
- fix embedding multihead (#49085) · 439b2b94
  Committed by Wangzheee on Dec 15, 2022
  
  439b2b94
- [Inference] memory_optimize and mkdlnn problem (#49054) · 04dd2861
  Committed by Wilber on Dec 15, 2022
```
* memory_optimize and mkdlnn problem

* update

* update

* update
```
  04dd2861
- fix: gloo compatible (#49084) · 3fec7a6e
  Committed by Wen Sun on Dec 15, 2022
  
  3fec7a6e
Dec 14, 2022 · 8 commits
- Fix nullptr to TestFuseGemmEpilogueReluBWDFP* (#48997) · e61df289
  Committed by Ming-Xu Huang on Dec 14, 2022
  
  e61df289
- [Paddle Inference] rewrite convert_to_mixed_precision (#48853) · 28ea9aad
  Committed by Yuanle Liu on Dec 14, 2022
  
  28ea9aad
- Divide elementwise case from BroadcastKernel and refine transpose autotune (#33051) · 6c9df13d
  Committed by limingshu on Dec 14, 2022
```
* First Commit.

* add some codes

* add elementwise loader

* fix code styles

* merge with develop

* add some changes both in elementwise and transpose

* add init operation in broadcast kernel.

* change codes according to pr suggestions about transpose file

* fix error for op-benchmark ci

* fix according to ci
```
  6c9df13d
- nullptr bugfix for XPU pg mode (#49043) · f0dab193
  Committed by james on Dec 14, 2022
```
* nullptr bugfix for XPU pg mode

Also a few kernels is added to xpu whitelist

* increase error msg length
```
  f0dab193
- modify cmake file for cuda11.8 compile (#49020) · d0284f85
  Committed by zqw_1997 on Dec 14, 2022
```
* modify cmake file for cuda11.8 compile

* add op_library(fused_embedding_eltwise_layernorm_op DEPS bert_encoder_functor)
```
  d0284f85
- Deleted mkldnn_inplace_pass code (#47818) · 3cfb2e1a
  Committed by Hulek on Dec 14, 2022
```
* Deleted mkldnn_inplace_pass code

* Fixed error with cmake

* Resolve conflicts
```
  3cfb2e1a
- [inference][trt] add more unary op and square (#48534) · e6cabea1
  Committed by Zhang Jun on Dec 14, 2022
```
* add more unary op and square
```
  e6cabea1
- Change mutable_data to ctx.Alloc. (#49001) · ceba70c3
  Committed by Yiqun Liu on Dec 14, 2022
  
  ceba70c3
Dec 13, 2022 · 6 commits
- Correct multiple inputs and outputs (#48872) · 0ffba1c9
  Committed by joanna.wozna.intel on Dec 13, 2022
  
  0ffba1c9
- Save fused_attention op memory when dropout_rate = 0.0 (#48902) · 428fb804
  Committed by sneaxiy on Dec 13, 2022
```
* save fused_attention memory when dropout_rate = 0.0

* add ut

* fix ut bug

* fix fused_layernorm_residual_dropout_bias_test.cu
```
  428fb804
- Generate static graph code of some ops by yaml (#48977) · 015db48e
  Committed by HappyHeavyRain on Dec 13, 2022
```
* generate static graph code of some ops by yaml

* fix the code-style of yaml

* fix the framework_ci for triangular_solve

* change the 'data_type' of scatter

* add the 'out: Out' of scatter_nd_add
```
  015db48e
- enable custom device save model on device memory && fix conflict (#48221) · b6aa9f53
  Committed by engineer1109 on Dec 13, 2022
  
  b6aa9f53
- Enable Generic-Plugin support FP16 (#48807) · 5d49e3e9
  Committed by weishengying on Dec 13, 2022
  
  5d49e3e9
- [Paddle Inference]fix some transformer unitest (#48929) · cb7f736f
  Committed by Wangzheee on Dec 13, 2022
```
* fix some transformer unitest
```
  cb7f736f
Dec 12, 2022 · 4 commits
- Revert "set free_when_no_cache_hit default value to true (#48815)" (#48968) · 0db36aca
  Committed by wanghuancoder on Dec 12, 2022
```
This reverts commit 592ed40b.
```
  0db36aca
- update fused_multi_transformer_encoder_pass support GPT new matmul API (#48953) · 8d4450db
  Committed by RichardWooSJTU on Dec 12, 2022
```
* fit paddle.matmul in fleetx.gpt
```
  8d4450db
- [PHI]Add new Tensor type and migrate save_combine kernel (#47856) · ecf892f0
  Committed by YuanRisheng on Dec 12, 2022

* add new tensor

* fix windows compile bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* perfect according comment

* fix ci compile bugs

* add raw tensor

* fix ci bugs

* modify code by comment

* delete String

ecf892f0

- Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3
  Committed by 傅剑寒 on Dec 12, 2022

* fix codestyle

* add double complex<float> complex<double> dtype support for syevj_batched

* fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some case

* optimize eigh in different case

* fix missing ; bug

* fix use_syevj bug

* fix use_cusolver_syevj_batched flag

16e364d3

PaddlePaddle / Paddle · synced successfully about 1 year ago