Commits · 47f87ad3eac25698419f8012ffb5f15304f71cb2 · PaddlePaddle / Paddle

24 Mar, 2023 (6 commits)

add phi operator allreduce/reduce (#51857) · 47f87ad3

Committed by TaoTao Li on Mar 24, 2023

* add all_reduce, reduce kernel and api

* fix all_reduce reduce ut

fix reduce op maker conflict

fix merge conflicts

* fix conflicts, rename ReduceOp->ReduceBaseOp in reduce_ops

rename allreduce op, to remove

* fix code format

fix comments

* modify test_collective_reduce_api ut timeout

* fix PR-CI-Build

fix comments: format phi operator

47f87ad3

Del old dygraph optest5 (#51686) · 6261076c
Committed by wanghuancoder on Mar 24, 2023
```
* delete old dygraph op test
```
6261076c
Del old dygraph MLU NPU (#51958) · 611f7ccc
Committed by wanghuancoder on Mar 24, 2023
```
* delete old dygraph, mlu npu do not use dygraph
```
611f7ccc

Memory Efficient Attention (#51867) · e5ad3859

Committed by ZhangDY-6483 on Mar 24, 2023

* first version, notest

* return final rst, notest

* use infinity() instead of max

* ut structure

* start up of ut

* generate lse

* update

* add depense

* reconstruct cmake

* move file

* add memory efficient attention and fix blasimpl

* update

* update cmake

* add namespace

* update cmake

* use .cu

* update for pad3d

* bug fix

* bug fix

* update

* bug fix

* update enforce

* add test case

* merge the lse pad

* fix kernel_fn of backward

* fix PADDLE_ENFORCE_EQ and phi_api

* fix PADDLE_ENFORCE

* fix PADDLE_ENFORCE

* rerun coverage

* fix memory efficient attention test

* rerun ci

* add cuda version condition

* add cuda version condition

* delete WIP test

* replace PADDLE_ENFORCE

* edit the namespace of datatype in multiple.cc

* rerun

* rerun

---------
Co-authored-by: Nliuyuang <liuyuang@baidu.com>

e5ad3859

do not test dygraph in dygraph (#52027) · 298a1a0b
Committed by wanghuancoder on Mar 24, 2023
```
* xpu do not test dygraph in dygraph
```
298a1a0b

Fix roll kernel gpu bug. (#52012) · b6d0dac9
Committed by Yuang Liu on Mar 24, 2023

b6d0dac9

23 Mar, 2023 (17 commits)
- add paddle-trt convert op: greater_equal (#52000) · 4dfbdb04
  Committed by Wangzheee on Mar 23, 2023
  
  4dfbdb04
- 【prim】delete high order prim flag && add special prune rules for node.cc (#51676) · 978d544b
  Committed by xiaoguoguo626807 on Mar 23, 2023
```
* delete prim flag for matmul_2_grad

* delete prim flag for matmul_2_grad

* add new setgradoutmeta for matmul_double_grad_node

* modify test and delete log

* deal with review
```
  978d544b
- [Prim] add meshgrid composite rule (#51061) · 53bb883d
  Committed by chenjian on Mar 23, 2023
```
* add meshgrid composite rule

* add meshgrid composite rule

* update

* add into CMakeLists

* fix

* update

* update

* optimize code

* fix meshgrid op

* update test
```
  53bb883d
- delete old dygraph xpu op test (#51955) · f8a8dd5e
  Committed by wanghuancoder on Mar 23, 2023
```
* delete old dygraph xpu op test
```
  f8a8dd5e
- register fluid kerenls to phi (#51976) · cc9bbd5b
  Committed by Huang Jiyi on Mar 23, 2023
```
* unify add_position_encoding

* unify affine_channel

* unify alloc_float_status

* unify allreduce

* unify alltoall

* unify anchor_generator

* unify ascend_trigger

* fix bug

* fix test
```
  cc9bbd5b
- register fluid activation kernel to phi (#51927) · aaa14780
  Committed by Huang Jiyi on Mar 23, 2023
```
* update

* update

* update

* update

* update

* fix test
```
  aaa14780
- [prim] add gelu vjp rule · 2add31f4
  Committed by cxxly on Mar 06, 2023
  
  2add31f4
- [Auto Parallel] Update rule based tuner (#51908) · 325fdf1d
  Committed by caozhou on Mar 23, 2023
```
* add patterns

* update rule based tuner

* add forward sub program completion

* add unittest

* add bwd sub program completion
```
  325fdf1d
- [AMP] Add bfloat16 Support for `elementwise_pow` Op (#51888) · 288ad844
  Committed by Lin Manhui on Mar 23, 2023
```
* Add bf16 support for elementwise_pow

* Update ut
```
  288ad844
- gather and gather nd fp16, bf16 support and add ut (#51903) · 5bcdfbb0
  Committed by Yuang Liu on Mar 23, 2023
  
  5bcdfbb0
- [AMP] Add bfloat16 and float16 tests for compare ops (#51978) · a7397e0c
  Committed by yeliang2258 on Mar 23, 2023
```
* add bf16 and fp16 tests

* fix dtype check
```
  a7397e0c
- 【PaddlePaddle Hackathon 4】No.63 fix temporal_shift and conj (#51532) · 1550348e
  Committed by LoneRanger on Mar 23, 2023
```
* add fp16 and bfp16 for temporalshift

* add fp16 and bfp16 for complex

* fix bug

* fix bug

* add fp16 and bf16 for conj

* fix bug

* fix bug

* Update complex_kernel.h

fix bug

* Update temporal_shift_grad_kernel.h

fix bug

* Update temporal_shift_kernel.h

fix bug
```
  1550348e
- [CodeStyle][C403] Unnecessary list comprehension (rewrite as a set comprehension) (#51968) · ca7394cd
  Committed by Infinity_lee on Mar 23, 2023
  
  ca7394cd
- [CodeStyle][C408][C409][C410] Fix unnecessary <dict/list/tuple> call and... · cf391b81
  Committed by PuQing on Mar 23, 2023
```
[CodeStyle][C408][C409][C410] Fix unnecessary <dict/list/tuple> call and unnecessary <list/tuple> passed to <list/tupule>() (#51928)

* autofix

* add select config

* autofix C410

* add C410 select
```
  cf391b81
- 【Hackathon No.45】Add float16 data type support for Paddle logical operators (#50926) · 0480ff5d
  Committed by denglianbin on Mar 23, 2023
```
* finish pr

* skip cpu test for logical

* change test style

* fix error.
```
  0480ff5d
- [CodeStyle][C404] Unnecessary list comprehension (rewrite as a dict comprehension) (#51969) · 1f8e6ad6
  Committed by Infinity_lee on Mar 23, 2023
  
  1f8e6ad6
- [CodeStyle][UP012] Unnecessary call to encode as UTF-8 (#51994) · 9796980c
  Committed by 张春乔 on Mar 23, 2023
  
  9796980c

22 Mar, 2023 (17 commits)

[Zero-Dim] Support 0-D tensor for some oneDNN unary kernels (#51687) · 2a3d75bc

Committed by YangQun on Mar 22, 2023

* support 0-d tensor for element wise unary ops

* fix python code style check

* fix approval check

* support 0-d tensor for onednn softmax and logsoftmax kernels

* fix commnets

* fix some unittests

2a3d75bc

add fused dropout add (#51752) · 6ba0507d
Committed by ShenLiang on Mar 22, 2023

6ba0507d

Add fused_feed_forward pass (#50423) · 5dda0ef6

Committed by Ghost Screaming on Mar 22, 2023

* Add fused_feed_forward pass for semi-automatic static graph training.

* Add fused_feedforward property in parallel_executor.cc

* Polish code.

* Polish fused feed_forward pass code. Support use_dropout1 and
use_dropout2 option.

* Support model parallel in fused_feedforward pass.

5dda0ef6

Extract fused_transpose op dedicated for oneDNN fuse passes (#50021) · 02296977

Committed by Sławomir Siwek on Mar 22, 2023

* extract common methods to reuse

* add header for transpose ops

* fused_transpose

* Split big function

* transpose2 tests

* fused_transpose

* Apply extra attributes

* add pbtxt file

* update pbtxt

* Merge develop

* add more strict op compats

* code  style

* remove mkldnn_data_type

* unify SetOutMemDescWithReshape2FuseSupport

* adjust quantize-dequantize for transpose

* remove appendact

* transpose2 quantization

* fix int8 tests

* adjust transpose_op to current develop

* delete fusion code from transpose_kernel

* add fused transpose to NHWC unittest

* change order

02296977

【AMP OP&Test】unit test for test_logit_op (#51051) · 289677e2
Committed by Bo Zhang on Mar 22, 2023
```
* test_logit_op

* add cudaKernel to replace eigen impl

* bf16 unit test CI
```
289677e2

[XPU] fix unit test of test_pad3d_op_xpu. (#51962) · de2166c0
Committed by houj04 on Mar 22, 2023

de2166c0

[AMP OP&Test] Fix fp16 check_grad when user_defined_grads is not None (#51959) · 153351e1
Committed by Zhang Zheng on Mar 22, 2023
```
* [AMP OP&Test] Fix fp16 check_grad when user_defined_grads are not None

* fix cond
```
153351e1

remove net_drawer.py, memory_analysis.py (#51869) · af2fa429
Committed by LoneRanger on Mar 22, 2023
```
* remove net_drawer.py

* remove memory_analysis.py

* remove test_memory_analysis.py
```
af2fa429

[BugFix] fix raw_program_optimizer not apply when using amp (#51865) · 202c06a2
Committed by kangguangli on Mar 22, 2023
```
* fix raw_program_optimizer not apply when using amp

* fix CI
```
202c06a2

Add reduce_max_grad composite rule (#51653) · d04c9cda
Committed by wangxiaoning on Mar 22, 2023
```
* max comp

* fix

* add test

* fix

* fix

* fix

* fix

* fix test

* fix api
```
d04c9cda

Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b

Committed by sneaxiy on Mar 22, 2023

* add fused_linear_param_grad_add_kernel

* fix compile error

* remove flag

* fix ci compile error

* fix ci compile error

* revert pylayer revision

* fix ci ut

* improve performance

f59c5d8b

inference support double data type (#51786) · a765eb26
Committed by Yuanle Liu on Mar 22, 2023

a765eb26

[CodeStyle][UP018] Unnecessary call to `str` (#51922) · 52a31b87
Committed by iSerendipity on Mar 22, 2023

52a31b87

【Eager】Allow return none when stop_gradient=False (#51740) · db599258

Committed by Jiabin Yang on Mar 22, 2023

* allow return none when stop_gradient=True

* remove useless code

* refine code

* refine code

* fix test cast

* change more test

* add more tests

db599258

【AMP OP&Test】unit test for accuracy_op (#51009) · 8c61a95a

Committed by Bo Zhang on Mar 22, 2023

* test_accuracy_op

* add create_test_fp/bf16_class

* cast after calculation

* change convert_uint16_to_float_ifneed

* delete TestAccuracyOpFp32 according to PR comment

* fix the rtol setting rules in bfloat16 forward

8c61a95a

[AMP OP&Test] Fix the rtol setting rules in bfloat16 forward (#51875) · f29c0ca1
Committed by Zhang Zheng on Mar 22, 2023

f29c0ca1

Replace OpTest.assertTrue(numpy.allclose) to numpy.testing.assert_allclose (#51690) · 75fb2ed9
Committed by Zhang Zheng on Mar 22, 2023

75fb2ed9

PaddlePaddle / Paddle synced successfully about 1 year ago
