Commits · 30f5e39b6c6031da2116488b03d7a5e23f04a4f7 · PaddlePaddle / Paddle

12 Jan, 2023 · 1 commit
- [PHI] Rename some PHI Kernel (#49470) · 30f5e39b
  authored by YuanRisheng on Jan 12, 2023
```
* rename kernel

* delete sig

* modify code according to comment

* fix ci bugs
```
11 Jan, 2023 · 2 commits

Implement a common segmented array. (#49450) · b1faa562

authored by Yiqun Liu on Jan 11, 2023

* Implement a common PointerArray.

* Polish codes.

* Add including of header file.

* Add the branch of kFix8.

* Fix compiling error.

* Add alignas hint to fix the performance drop.

* Optimize the H2D copy in stack_grad.

* Rename the macro.

* Fix align hint for different compilers.

* Polish the define of PADDLE_ALIGN.

* Fix compiling error.

* Remove the align hint on windows.

Update the style of print for low precision op list (#49648) · 395520f1
authored by niuliling123 on Jan 11, 2023

10 Jan, 2023 · 6 commits

Optimization for StackGradCUDAKernel for last dimension stack case. (#48992) · 0cae5c7f

authored by limingshu on Jan 10, 2023

* add stack grad kernel optimization

* add basic optimization kernel for stack_grad_kernel

* optimization of stack_grad_kernel for last dim stack and change code format with pre-commit

Use `CommContextManager` to init comm op using gloo backend (#49666) · 05df6973

authored by Wen Sun on Jan 10, 2023

* refactor: gloo comm context migration

* fix: headers & avoid mutable_data usage

* fix: cmake gloo dep

* style: rename funcs

* refactor: move to new files

* fix: gloo deps

* refactor: simplify create device

[Zero-Dim] support input 0D Tensor for maximum,minimum,allclose,sigmoid_focal_loss (#49616) · 98693428

authored by FlyingQianMM on Jan 10, 2023

* [Zero-Dim] support input 0D Tensor for maximum,minimum,allclose,sigmoid_focal_loss

* [Zero-Dim] add backward test for sigmoid_focal_loss with 0-D input Tensor

[PHI Decoupling] move sequence_scale from fluid to phi (#49668) · a36c5490

authored by Ryan on Jan 10, 2023

* try sequence_padding

* fix can't use mutable_data

* fix mistake fluid_sequence_scale.hh/CMakeLists.t include

* fix namespace bug

* fix framework::ToAbsOffset not found

* fix codestyle

Refine name style and MoeKernel (#49432) · 39210ed0
authored by MarDino on Jan 10, 2023

Add cuda compiled arch check (#49592) · c0d6ec63
authored by MarDino on Jan 10, 2023

09 Jan, 2023 · 8 commits

Add concat optimization (#49540) · 1a0b3661

authored by MarDino on Jan 09, 2023

* add concat optimization

* refine

* remove annotation

* use alignas instead of aligned_storage

Support 'drop_empty_grad' in the output of backward_ops (#49588) · 36c6c589
authored by HappyHeavyRain on Jan 09, 2023
```
* support the drop_empty_grad in backward

* change code according to yunfei's review suggestion
```
add fill/fill_any for kunlun (#49645) · 31ea3231
authored by QingshuChen on Jan 09, 2023

[XPU] add einsum fill diagonal and diagonal kernels (#49465) · a5bf156b

authored by ykkk2333 on Jan 09, 2023

* migrate shaple sgd, split,sign xpu kernels to phi, test=kunlun

* fix dlrm throughput problem, test=kunlun

* add xpu einsum, fill_diagonal, and diagonal kernels, test=kunlun

Prim paddle Basic (#49272) · 2f601282

authored by Jiabin Yang on Jan 09, 2023

* proto type of composite grad in paddle

* proto type of composite grad in paddle

* refactor composite api with phi

* fix compile error

* support static graph code-gen for squeeze op

* generate static graph code of unsqueeze

* refine op name

* fix compile error

* add extra output in op_compat

* remove debug log

* fix clang compile error

* support prim switch flag

* support prim switch flag

* fix dygraph error

* merge develop

* add code_gen

* add necessary files without codegen

* fix code_gen bug

* add deps

* modify ignore

* add ignore

* delete std cout

* add composite logic for backward.py

* add tanh first order grad composite

* support enable_prim flag for static graph

* throw exception when both GrapOpMaker and GradCompOpMaker have not been registered

* reorganize the directory of prim api tests

* fix windows error

* add eager_utils

* add eager_utils

* modify code gen

* add composite parse

* add unittest for get_grad_op_desc

* code optimize

* fix static test on windows

* support generate static graph code for imag and real op

* fix windows compile error in test_static_prim

* merge develop

* disable test eager in inference

* prim code gen

* disable eager compile in inference

* rm other file

* rm gitignore file

* code_style

* add eager test

* code_style

* merge develop

* remove useless files

* modify static test

* support bool flag from singleton

* merge develop

* recover git ignore

* fix conflict

* recover git ignore for generated op

* fix test compile error

* remove some tests

* add python test

* fix some name issue

* add composite code gen

* modify backward yaml

* fix static composite grad maker code gen

* remove additional files

* add some static funcs unit test

* fix some bugs

* fix composite grad maker register code gen

* optimize some functions
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
Co-authored-by: wangruting <wangruting@baidu.com>
Co-authored-by: cxxly <chenxx_id@163.com>
Co-authored-by: charles-hit <wanghao107@baidu.com>
Co-authored-by: xiaoguoguo626807 <100397923+xiaoguoguo626807@users.noreply.github.com>

add C++ api (#49613) · 65d2b4af
authored by HongyuJia on Jan 09, 2023

[0 Tensor support] cumprod (#49550) · 50a8b655
authored by wangzhen38 on Jan 09, 2023

Create comm_context and modified static init (#49536) · 04e24e58

authored by LiYuRio on Jan 09, 2023

* comm_context and static init

* refactor: move to phi/core/distributed

* refactor: avoid mutable_data usage

* fix: windows sock

* fix: device without nccl
Co-authored-by: Wen Sun <syl1887415157@126.com>

06 Jan, 2023 · 8 commits
- Dev (#49591) · 07db4a9f
  authored by RuohengMa on Jan 06, 2023
```
* add bitwise and, bitwise not, bitwise or and bitwise xor

* correct typo
```
- [zero-dim] Support 0-d for kthvalue and mode (#49340) · 292738f3
  authored by JYChen on Jan 06, 2023
```
* add 0-d support for paddle.kthvalue

* add 0-d support for paddle.mode

* fix coverage test for device

* fix check-bug in windows

* change axis check from LT to LE

* add shape & value check for grad when input is 0d tensor
```
- fix typo, compatiable->compatible, test=document_fix (#49552) · 6ec8dfdd
  authored by HongyuJia on Jan 06, 2023
- [Zero-Dim] Support Zero dim for embedding and one-hot (#49562) · 370b50f6
  authored by seemingwang on Jan 06, 2023
```
* zero-tensor

* remove unused

* zero_dim_xpu

* relocate

* add value test

* fix syntax
```
- [Zero-Dim] Flatten support 0d tensor (#49361) · 0093aaa6
  authored by jiangcheng on Jan 06, 2023
```
* flatten op support 0D-tensor

* add test in zero dim py

* fix shape should be list

* short code for ci-coverage

* add backward test

* simple code for ci coverage

* add axis check

* add 0D-tensor test in test_flatten_contiguous_range_op.py

* add axis error test for Coverage CI

* add more test for CI-Coverage

* add more test for CI-Coverage
```
- fix bug (#49546) · e0ee7403
  authored by Thomas Young on Jan 06, 2023
- Expansions of some unmaintained pr (#49551) · 419c2d14
  authored by 张春乔 on Jan 06, 2023
- Fix inaccurate return of low precision op list (#49391) · a214e5dc
  authored by niuliling123 on Jan 06, 2023
05 Jan, 2023 · 7 commits
- Support 0D for paddle.sort/argsort (#49501) · 032da731
  authored by Siming Dai on Jan 05, 2023
```
* support 0D for paddle.sort/argsort

* support 0D tensor for paddle.sort/argsort in xpu

* fix bug

* fix grad and add value assertion
```
- [Paddle Inference] Add ci flags for a persistent IBuilder. (#49538) · fcd6d675
  authored by xiaoxiaohehe001 on Jan 05, 2023
- Generate the static graph code of ops (#49413) · 39f0eb2c
  authored by HappyHeavyRain on Jan 05, 2023
```
* generate the static graph code of ops

* modify the isclose comment

* modify the clip comment in nn.py

* reset nn.py
```
- [BugFix] Fix illegal memory overflow for p_norm op (#49537) · ba1dce0a
  authored by Zhong Hui on Jan 05, 2023
- support generate static graph code for imag and real op (#49523) · 192eb4d5
  authored by zyfncg on Jan 05, 2023
- fix trace heap overflow (#49548) · 5feadc0b
  authored by XiangGao on Jan 05, 2023
- Add to_hash func and paddle2arg map for cinn (#49402) · 1168a178
  authored by GaoYuYang on Jan 05, 2023
04 Jan, 2023 · 5 commits

Add the input check for softmax_with_cross_entropy (#49333) · f17b2de8
authored by Guanghua Yu on Jan 04, 2023

[Inference] Add conv_fusion nhwc impl. (#49047) · 4a8708bb
authored by Wilber on Jan 04, 2023

refine diagonal infermeta (#49520) · 852c8db3
authored by zhangbo9674 on Jan 04, 2023

[Paddle Inference] fix mixed precision diff (#49475) · ac75a9a6
authored by Yuanle Liu on Jan 04, 2023

[Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f

authored by HongyuJia on Jan 04, 2023

* execute use kernel_key first

* change OpKernelType->KernelKey

* fix py3 compile error, remove redundant header files

* fix build_strategy_test

* fix DataType::RAW

* fix custom_type test: operator_test.cc

* fix transform place

* fix backends_are_same_class

* try fix place TransDataDevice

* support all KernelKey

* fix TransformData

* fix place_are_same_class

* fix merge

* fix test_params_no_grad

* fix specific place of GetExpectedKernelType

* fix specific place of GetExpectedKernelType

* fix GetKernelTypeForVar

* fix dtype error

* fix fetch_v2

* change GetKernelTypeForVar

* fix interpreter

* fix typo error

* polish codes

* polish codes

* polish codes

* fix conflict

03 Jan, 2023 · 3 commits
- H2D data transfer optimization for concat kernel (#49040) · 0de94cd9
  authored by limingshu on Jan 03, 2023
- [Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e
  authored by zhoutianzi666 on Jan 03, 2023
```
* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.
```
- Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad. (#49419) · c4604025
  authored by Yiqun Liu on Jan 03, 2023
```
* Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad.

* Correct the axis when there is only 1 input in BroadcastKernel.

* Add the calculation of output's shape.
```

PaddlePaddle / Paddle · synced successfully almost 2 years ago
