Commits · 6d5d9f23e929f37b0ec83f522d54cc1b4196ec2d · PaddlePaddle / Paddle

04 Jul, 2023 · 2 commits
- [XPU] Add XPU plugin support (#55101) · 6d5d9f23
  Committed by hong19860320 on Jul 04, 2023
```
* Add XPU plugin to support the customized ops or improve the performance of the fusion ops based on hand-written xpu micro kernels.

* refine README.md
```
- [CustomDevice] refine set_constant_with_place by calling full kernel (#55089) · 07b83f2e
  Committed by ronnywang on Jul 04, 2023
03 Jul, 2023 · 5 commits
- [XPU] Fix the topk, set_value ops that using temporary tensors avoiding the... · cc2059a0
  Committed by jiangfan06 on Jul 03, 2023
```
[XPU] Fix the topk, set_value ops that using temporary tensors avoiding the memory overlaps during multi-stream inference (#54851)
```
- [CustomDevice] release device manager in py::atexit (#54932) · e5725680
  Committed by ronnywang on Jul 03, 2023
```
* [CustomDevice] release device manager in py::atexit

* fix hip_version macro

* update

* update
```
- 【PaddlePaddle Hackathon 4】No.63 : add lerp bf16 support (#53078) · ce31a72e
  Committed by LoneRanger on Jul 03, 2023
```
* add lerp bf16 support

* fix bug

* Update test_lerp_op.py

modify the input dtype

* modify the test_lerp_op.py

* Update test_lerp_op.py

* fix bug of import

* add user_defined_grads

* Update test_lerp_op.py

* fix bug of grad

* fix bug of grad

* fix bug of grad

* add the check for bfloat16 dtype
```
- add linear_compress API (#54140) · c4d5ec66
  Committed by FormlessUnit on Jul 03, 2023
```
* add linear_compress API
```
- Update the rope op according to the comments (#54985) · 2401d48d
  Committed by niuliling123 on Jul 03, 2023
02 Jul, 2023 · 1 commit
- Fix fetch op and null type bug (#55027) · a20051cd
  Committed by hong on Jul 02, 2023
```
* fix_fetch_op_and_null_type_bug

* fix compile bug

* add test case
```
30 Jun, 2023 · 1 commit
- [XPU] Add conv2d transpose fuse pass (#54904) · 12c15b89
  Committed by mjp9527 on Jun 30, 2023
29 Jun, 2023 · 3 commits
- Fix compiling on XPU related to MPTypeTrait. (#54924) · 7353e9e9
  Committed by Yiqun Liu on Jun 29, 2023
```
* Fix compiling on XPU related to MPTypeTrait.

* Unify the use of MPTypeTrait.

* Fix compiling error.
```
- Add fused_rope forward op (#54351) · a215c46a
  Committed by niuliling123 on Jun 29, 2023
```
* style

* more

* update ctest

* Update legacy_backward.yaml

* Update legacy_ops.yaml

* Update legacy_ops.yaml

* update

* update

* update for move
```
- [XPU] fix layer_norm_grad bug when bias_grad and scale_grad are nullptr (#54669) · 55b974e7
  Committed by haosicheng on Jun 29, 2023
28 Jun, 2023 · 4 commits
- [XPU][PHI Kernels] add int_with_ll quantization for conv kernels (#54827) · bd67209f
  Committed by lijin23 on Jun 28, 2023
```
* add int_with_ll to conv

* fix bugs when output_size is specified for conv2d_transpose
```
- [BugFix] Fix bug for binary_cross_entropy_with_logits loss (#54869) · bb42d870
  Committed by Siming Dai on Jun 28, 2023
```
* add pos_weight in kernel

* fix unittest

* fix xpu

* fix bce unittest, change infermeta order
```
- [ROCM] fix cupti, rccl on rocm (#54807) · 57da105c
  Committed by ronnywang on Jun 28, 2023
```
* [ROCM] fix cupti, hipcub

* update

* update
```
- Support 0-D Tensor for check_numerics_kernel. (#54868) · b7fbd339
  Committed by Yiqun Liu on Jun 28, 2023
27 Jun, 2023 · 2 commits
- delete swish_raw (#54536) · 0cdaafea
  Committed by zhangyuqin1998 on Jun 27, 2023
```
* delete swish_raw

* fix

* Update activation_kernel.cc

* fix
```
- add all_to_all phi operator (#54797) · 158b7ae5
  Committed by TaoTao Li on Jun 27, 2023
```
* add all_to_all phi operator, kernel, api

* add all_to_all ut

* tinyfix
```
26 Jun, 2023 · 2 commits
- exclude xpu (#54848) · 6962d3e2
  Committed by pangengzheng on Jun 26, 2023
- remove ops from OpsWithFluidKernelNeedMoveToPhi set (#54007) · 733eca85
  Committed by Sonder on Jun 26, 2023

* remove ops from OpsWithFluidKernelNeedMoveToPhi set

* open static build flag

* OpsWithFluidKernelNeedMoveToPhi

* open new_executor_static_build

* add infermate for cudnn_lstm

* fix

* update

* fix

* update

* update

* update

* fix pow2 decay

* fix pow2 decay

* recover analysis_predictor.cc

* fix pow2 decay

* fix cudnn lstm

* add output register info for svd

* fix pow2_decay_with_linear_warmup_kernel

* recover test lstm cudnn

* recover svg register codes

* fix register info

* fix reduce sum register info

* add output info for adadelta

* add output info for adadelta

* add output info for adamax

* fix complex abs register info

* add register info for cudnn_lstm_grad

* recover

* fix lstm cudnn

* fix

* fix xpu output registe info

* remove std::cout

* add backend

* remove output info in pow2_decay_with_linear_warmup_kernel

* add judgment in TensorShouldBeFakeInitialized

* recover power_

* close new_executor_static_build

* fix set_value_xpu

25 Jun, 2023 · 2 commits
- Support fetch in new ir (#54826) · e66beb0b
  Committed by hong on Jun 25, 2023
```
* add fetch kernel

* support fetch var in new ir

* fix bug

* polish code

* change array equal to np.testing
```
- [XPU] fix 0-dim of SplitFunctor. (#54816) · 03d6d98c
  Committed by houj04 on Jun 25, 2023
21 Jun, 2023 · 1 commit
- add int quantization for xpu (#54802) · e9f8baa6
  Committed by lijin23 on Jun 21, 2023
20 Jun, 2023 · 4 commits
- static graph autogen code support for matmul op (#54338) · ad80fbfe
  Committed by Wang Xin on Jun 20, 2023
```
* static graph autogen code support for matmul op

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug
```
- prepare for collective communicate upgrade in dygraph (#54417) · 46c57674
  Committed by TaoTao Li on Jun 20, 2023
- Remove reduntant definition of MPTypeTrait. (#54756) · f469f176
  Committed by Yiqun Liu on Jun 20, 2023
- [XPU][PHI Kernels] add unique kernel for xpu (#54758) · f836e7d2
  Committed by lijin23 on Jun 20, 2023
```
* add unique kernel for xpu

* add unique kernel for xpu

* update uniittest

* add xpu support for unique with axis
```
19 Jun, 2023 · 2 commits
- Fix incorrect size of grid dimension in index_select (#54660) · 20bf9592
  Committed by Leo Chen on Jun 19, 2023
- [XPU] fix gather_nd op when index's numel is 0. (#54714) · 3355c0c0
  Committed by houj04 on Jun 19, 2023
16 Jun, 2023 · 3 commits
- fix batch_norm grad kernel nhwc error (#54703) · 4c6f77d8
  Committed by cyber-pioneer on Jun 16, 2023
- fix lamb optimizer always_adapt (#54654) · 2a56f4b3
  Committed by zhiboniu on Jun 16, 2023
```
* fix lamb always_adapt

* fix optest

* fix all optests
```
- fix batch_norm cuda grad kernel test mode bug (#54681) · eb9d07e5
  Committed by cyber-pioneer on Jun 16, 2023
15 Jun, 2023 · 1 commit
- exp/expm1 support int32/int64/float16 forward (#54556) · 58ae8c7c
  Committed by Hui Zhang on Jun 15, 2023

* fix for log xxx

* add int32/int64 for cpu/gpu; add float16/bfloat16 for cpu forward

* fix docstring

* fix bug

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bug

* using cast

* fix test

* fix api

* fix other bugs

* fix ci bug for not using dygraph guard

* add bfloat16 test

* fix ut

* bf16

* exp/expm1 support int32/int64

* fix ut

* fix ut

* fix ut

14 Jun, 2023 · 4 commits
- fix mea get pad no default return bug (#54644) · c037453d
  Committed by Chitsing KUI on Jun 14, 2023
- support group_norm and cumsum prim ops bf16 dtype (#54580) · f7eb03c6
  Committed by Charles-hit on Jun 14, 2023
- [Zero-Dim] paddle.nanmedian/nanquantile support 0D Tensor (#54500) · 3d4d995f
  Committed by zhouweiwei2014 on Jun 14, 2023
```
* [Zero-Dim] paddle.nanmedian support 0D Tensor

* fix CI
```
- Fix A100 CUDA12 ut (#54487) · a96c6dc7
  Committed by sneaxiy on Jun 14, 2023
```
* fix A100 CUDA12 ut

* fix ci uts

* fix test_sync_batch_norm_op

* fix sync bn op ut again by separating 2 files

* fix codestyle ci

* combine other PRs

* fix codestyle

* fix codestyle ci
```
13 Jun, 2023 · 1 commit
- 【Hackathon 4 No.17】Add cummax / cummin API to Paddle (#53546) · 3a3fb1fe
  Committed by NetPunk on Jun 13, 2023
12 Jun, 2023 · 2 commits
- log/Log10/log2/log1p support int32/int64/float16/bfloat16 forward (#54089) · 2ddd0473
  Committed by Hui Zhang on Jun 12, 2023

* fix for log xxx

* add int32/int64 for cpu/gpu; add float16/bfloat16 for cpu forward

* fix docstring

* fix bug

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bug

* using cast

* fix test

* fix api

* fix other bugs

* fix ci bug for not using dygraph guard

* add bfloat16 test

* fix ut

* bf16

- [inference]conv_fusion support bias's rank equal to input's rank (#54477) · 03dbdbd1
  Committed by Zhang Jun on Jun 12, 2023
```
* support bias's rank equal to input's rank
```

PaddlePaddle / Paddle synced successfully about 1 year ago
