Commits · 1cfcb71d980075a448981c1be1122d7032cdc39c · PaddlePaddle / Paddle

21 Feb, 2023 1 commit

[PHI Decoupling]Remove memory header (Part1) (#50419) · 1cfcb71d

Committed by YuanRisheng on Feb 21, 2023

* decouple_memory

* perfect memory utils

* fix ci bugs

* fix inference bugs

* fix custom test bugs

* fix coverage bugs

* modify code according to comments

* modify namespace

* deal with compile bugs

1cfcb71d

20 Feb, 2023 1 commit

[PHI decoupling] remove reference to fluid/framework/tensor.h in phi (#50475) · 1c8e15c9

Committed by RedContritio on Feb 20, 2023

1c8e15c9

17 Feb, 2023 2 commits

Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2

Committed by yuehuayingxueluo on Feb 17, 2023

* rename multi_tensor_adam to fused_adam

* fix some bugs

* fix CI coverage

* rename test_fused_adam.py

* fix some bug

* add test_fused_adam_op.py

* fix some bugs

* fix fused_adam_op.cc

* fix CI bugs

* fix CI bug

* fix CI bug

e6af9bd2

[phi decoupling] clean TensorCopy usage in phi (#50538) · b5da73c5

Committed by Huang Jiyi on Feb 17, 2023

* rm framework::tensor_util in phi

* clean TensorCopy

* fix bugs

* fix bugs

* fix bugs

* replace mutable_data

* revert custom_device_test.cc

b5da73c5

16 Feb, 2023 3 commits

[dy2static-bugfix] fix backward gradient aggregation bugs (#50474) · d4c7774f

Committed by xiongkun on Feb 16, 2023

* [dy2static-bugfix] fix backward gradient aggregation bugs
1. Yolov3 and Yolov5 both face the same problem.

* remove set_device

* code review fix

d4c7774f

[Phi decouple] move layer_norm_kernel.cu.h to phi (#50506) · 8910bb4a

Committed by Huang Jiyi on Feb 16, 2023

* move layer_norm_kernel.cu.h to phi

* fix bugs

* fix namespace

* fix bugs

* fix CI-Windows

* replace mutable_data

* fix bugs

* fix bugs

8910bb4a

[phi decoupling] remove variable.h in phi (#50407) · 905cefd4

Committed by Huang Jiyi on Feb 16, 2023

* move variable_utils from phi_api_utils to fluid

* fix comment

* update include

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* update

* update

* fix CI-Windows-OpenBLAS

* fix bugs

* fix bugs

* fix bugs

* update include

* move variable_utils to phi_utils

* fix namespace

905cefd4

14 Feb, 2023 2 commits

decouple tensor_utils (#50264) · 057cdb95

Committed by engineer1109 on Feb 14, 2023

fix X

remove TensorCopy

codestyle

add fluid memory header

fix symbol

fix cmake

fix cmake

fix context

fix header

fix place

fix context

fix context

fix context

fix code

fix custom context

fix custom context

fix copy

fix data_transform

fix style

remove changes of custom

fix scalar

057cdb95

support int8 for embedding (#50413) · 78eb2d87

Committed by seemingwang on Feb 14, 2023

78eb2d87

09 Feb, 2023 3 commits

[PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a

Committed by Huang Jiyi on Feb 09, 2023

* decouple strided_memcpy

* move strided_memcpy

* move strided_memcpy to phi

* fix namespace

* update

* fix gpu compile bugs

17318c1a

remove layout_utils in phi (#50355) · 90650534

Committed by Huang Jiyi on Feb 09, 2023

90650534

Add MultiTensorAdam OP (#49220) · 10654c77

Committed by yuehuayingxueluo on Feb 09, 2023

* add multi_tensor_adam

* update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py

* fix adam.py optimizer.py

* fix adamw.py

* fix test_multi_tensor_adam.py

* fix CI bug

* fix CI coverage

* fix ci bug

* fix betapow

* fix some bugs

* fix test_adamw_op.py

* fix CI coverage

* fix multi_tensor_adam_kernel.cc

* fix CI bug

* fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py

* fix code style

* update C++ parts

* remove python parts modification temporarily

* add C++ ut

* update betapow copy code logic

* fix ci ut

* fix windows ci

* fix coverage ci

* improve coverage rate

---------
Co-authored-by: sneaxiy <sneaxiy@126.com>

10654c77

08 Feb, 2023 2 commits

Fix bn performance degradation (#50287) · 6f1ec935

Committed by zhangkaihuo on Feb 08, 2023

* fix bn performance degradation

6f1ec935

move mixed_vector (#50282) · 35d7d1f0

Committed by Huang Jiyi on Feb 08, 2023

35d7d1f0

06 Feb, 2023 1 commit

phi move ReshapeToMatrix & GetValue (#50139) · d09962a1

Committed by engineer1109 on Feb 06, 2023

d09962a1

03 Feb, 2023 3 commits

Fix stack overflow of case8: paddle.unique_consecutive (#49983) · 83077f6f

Committed by RedContritio on Feb 03, 2023

* support negative index in unique_consecutive

* add unittest

* add unittest

83077f6f

Fix stack overflow of case3: paddle.metric.accuracy (#49984) · 97411214

Committed by RedContritio on Feb 03, 2023

* add input check for accuracyOp

* add input check for gpu/accuracyOp

* add unittest

* use rank instead of dimensions in message

* update unittest

* update unittest

97411214

Generate some static graph ops (#49906) · 85490f70

Committed by HappyHeavyRain on Feb 03, 2023

* generate some static graph ops

* fix the bug of pow

* add REGISTER_ACTIVATION_OP in operators.cmake

* modify the file operators.cmake

85490f70

02 Feb, 2023 2 commits

Several ops support zero dim on GPU and CPU (#49959) · 5db88d08

Committed by Ccc on Feb 02, 2023

* paddle.nn.functional.softmax
* paddle.nn.functional.log_softmax
* paddle.nn.functional.gumbel_softmax
* paddle.nn.functional.prelu

5db88d08

Fix the FP16 precision problem of add_n. (#50129) · 14dd68e1

Committed by liuruyan on Feb 02, 2023

14dd68e1

01 Feb, 2023 2 commits

[Zero-Dim] Fix 0-dim tensor for arg_min_max op. (#49570) · e4e94a88

Committed by Zhong Hui on Feb 01, 2023

* fix 0-d tensor for arg_min_max op.

* fix xpu.

* fix zero dims

* fix

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update test_zero_dim_tensor.py

* Update test_zero_dim_tensor_xpu.py

* Update test_zero_dim_tensor.py

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

e4e94a88

[Divide by 0 Error] add norm check (#49966) · 5dfddaea

Committed by gouzil on Feb 01, 2023

* [Divide by 0 Error] add norm check

* [Divide by 0 Error] fix x AttributeError

* [Divide by 0 Error] norm check migrate to c++

5dfddaea

31 Jan, 2023 4 commits

optimize 2D sync_batch_norm (#49663) · 9a4acfee

Committed by zhangkaihuo on Jan 31, 2023

9a4acfee

support fp16 squaredl2norm (#48315) · ce4637c1

Committed by 201716010711 on Jan 30, 2023

ce4637c1

add dims check for nms_kernel (#49993) · 4976153d

Committed by RedContritio on Jan 31, 2023

4976153d

Unify the gpu implementation of stack and unstack to reuse the optimization. (#49748) · 3586e856

Committed by Yiqun Liu on Jan 31, 2023

* Unify the gpu implementation of stack and unstack to reuse the optimization.

* Optimize the cuda implementation of unstack.

* Use GpuMemcpyAsync instead of memory::Copy.

* Fix error of calculating the index.

* Use FastDivMod to further improve the performance of unstack.

3586e856

30 Jan, 2023 2 commits

[Divide by 0 Error] add pinv check (#49951) · f6e874bc

Committed by Ryan on Jan 30, 2023

* add pinv check

* add unittest

* update unittest

* roll back

* fix not call stupid bug

* use context

f6e874bc

add phi tensor vector array api from fluid (#49885) · 094e3b8c

Committed by engineer1109 on Jan 30, 2023

replace all TensorFromVector & TensorToVector

AssignKernel async copy

094e3b8c

18 Jan, 2023 1 commit

[0 Tensor support] support the 0d tensor for the cumsum (#49518) · 5fca45ea

Committed by wawltor on Jan 18, 2023

* Add the cumsum 0d tensor

* xpu and cpu judge the 0d tensor

* change 2022 to 2023 in new commit

* fix the reverse logic

5fca45ea

16 Jan, 2023 1 commit

CUDA12.0 integration (#49539) · 1885d55a

Committed by zlsh80826 on Jan 16, 2023

* Update warpctc for cuda-12

* Deprecate cudaProfilerInitialize for CUDA > 11

* Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040

* Add the missing thrust header

1885d55a

13 Jan, 2023 1 commit

Update threshold of bn1d (#49734) · 0294ab41

Committed by zhangkaihuo on Jan 13, 2023

0294ab41

12 Jan, 2023 2 commits

lerp support 0 Tensor (#49667) · 8cd0d5b3

Committed by sunli on Jan 12, 2023

* lerp support 0 Tensor

* fix lerp grad

* fix lerp zero test

* fix 0D + ND/ND + 0D

* fix check

* update code

* fix lerp infer shape

* static backward test

* update static graph test

8cd0d5b3

[PHI]Rename some PHI Kernel (#49470) · 30f5e39b

Committed by YuanRisheng on Jan 12, 2023

* rename kernel

* delete sig

* modify code according to comments

* fix ci bugs

30f5e39b

11 Jan, 2023 1 commit

Implement a common segmented array. (#49450) · b1faa562

Committed by Yiqun Liu on Jan 11, 2023

* Implement a common PointerArray.

* Polish codes.

* Add including of header file.

* Add the branch of kFix8.

* Fix compiling error.

* Add alignas hint to fix the performance drop.

* Optimize the H2D copy in stack_grad.

* Rename the macro.

* Fix align hint for different compilers.

* Polish the definition of PADDLE_ALIGN.

* Fix compiling error.

* Remove the align hint on windows.

b1faa562

10 Jan, 2023 2 commits

Optimization for StackGradCUDAKernel for last dimension stack case. (#48992) · 0cae5c7f

Committed by limingshu on Jan 10, 2023

* add stack grad kernel optimization

* add basic optimization kernel for stack_grad_kernel

* optimization of stack_grad_kernel for last dim stack and change code format with pre-commit

0cae5c7f

Refine name style and MoeKernel (#49432) · 39210ed0

Committed by MarDino on Jan 10, 2023

39210ed0

09 Jan, 2023 1 commit

[0 Tensor support] cumprod (#49550) · 50a8b655

Committed by wangzhen38 on Jan 09, 2023

50a8b655

06 Jan, 2023 1 commit

[zero-dim] Support 0-d for kthvalue and mode (#49340) · 292738f3

Committed by JYChen on Jan 06, 2023

* add 0-d support for paddle.kthvalue

* add 0-d support for paddle.mode

* fix coverage test for device

* fix check-bug in windows

* change axis check from LT to LE

* add shape & value check for grad when input is 0d tensor

292738f3

05 Jan, 2023 2 commits

Support 0D for paddle.sort/argsort (#49501) · 032da731

Committed by Siming Dai on Jan 05, 2023

* support 0D for paddle.sort/argsort

* support 0D tensor for paddle.sort/argsort in xpu

* fix bug

* fix grad and add value assertion

032da731

support generate static graph code for imag and real op (#49523) · 192eb4d5

Committed by zyfncg on Jan 05, 2023

192eb4d5

PaddlePaddle / Paddle synced successfully about 1 year ago