Commits · 98baedafded479d2d24fe6e53f4193b67cd5dab2 · PaddlePaddle / Paddle

07 Sep, 2023 7 commits
- [clang-tidy] NO.25 modernize-deprecated-headers(#56994) · 98baedaf
  Committed by cyberslack_lee on Sep 07, 2023
- refine stride flag (#57005) · 8c75039c
  Committed by wanghuancoder on Sep 07, 2023
```
* refine stride flag
```
- 【complex op】No.18 add complex support for silu (#57058) · d78cbee7
  Committed by Ruibin Cheung on Sep 07, 2023
- [complex] add complex support for silu (#56903) · 1d4e938d
  Committed by Ruibin Cheung on Sep 07, 2023
- 【Complex OP】No.28 LogSigmoid (#56852) · f6463eaa
  Committed by yangguohao on Sep 07, 2023
```
* complex op logsigmoid

* fix 2023-08-31
```
- [CustomDevice] Allow registration of not ready kernels (#57038) · 12301bc5
  Committed by ronnywang on Sep 07, 2023
- [NewIR] Update send recv infermeta and add unittest (#56794) · 2857fdbb
  Committed by zhaoyingli on Sep 07, 2023
```
* [NewIR]Update send recv infermeta and add unittest

* rm new ir flag

* rm fluid api

* skip runing startup prog

* update flag name

* update recv_v2 yaml

* fix conflict

* unittest only for pp

* fix cmakelist

* unittest check precision

* control random

* fix cmakelist
```
06 Sep, 2023 7 commits

[AutoParallel] Generate spmd rule and reshard impl in phi api (#56831) · e9364a38

Committed by Chen Weihang on Sep 06, 2023

* add spmd and reshard code gen

* add backward reshard code gen

* test matmul forward success

* polish test impl

* add unsafe mutable value

* polish details and add test

* fix unittest time out

* fix typo

* refactor reshard input generate impl

* resolve conflict with develop

* fix compile error

[XPU] add squeeze_excitation_block_xpu op&pass to optimize ppocr_v3_det model (#56773) · 7c8c9b7d
Committed by leolishaohao on Sep 06, 2023
```
* [XPU] add squeeze_excitation_block_xpu op&pass to optimize ppocr_v3_det model test=kunlun

* fix

* fix Codestype

* remove xpu name
```
[IR] Refine the Build interface of split op (#56924) · ada16f94
Committed by zhangbo9674 on Sep 06, 2023
```
* fix bug

* fix bug
```

Auto codegen for supporting calling new_ir api in static operants (#56955) · 3eafa1fc

Committed by Xianduo Li on Sep 06, 2023

* support new ir primitive operator in static operants

* support more vjp code gen

* support more vjp code gen

* support more vjp code gen

* use code gen

* fix operants codegen

* support more vjp code gen

* Fix ci build error

* set FLAGS_tensor_operants_mode to static in generated_vjp for testing

* fix bugs

* change the order of ops_name of divide_grad

* replace FLAGS_enable_new_ir_in_executor by FLAGS_enable_new_ir_api in codegen and test_vjp_prim

---------
Co-authored-by: Charles-hit <wanghao107@baidu.com>
Co-authored-by: 0x45f <wangzhen45@baidu.com>

[ONEDNN] fix accuracy issue of fc when the input shapes are dynamic · c62902ee
Committed by zhanglirong1999 on Sep 06, 2023

[IR] Add IrMetaTensor (#56973) · 75c4a24c
Committed by zhangbo9674 on Sep 06, 2023
```
* add meta tensor

* refine code

* fix bug

* fix bug
```
refine bilinear interp grad register (#56976) · e2b05dcc
Committed by hong on Sep 06, 2023

05 Sep, 2023 7 commits

[Fluid] move lars_momentum_op InferShape to phi (#56749) · 52a0a677
Committed by gouzil on Sep 05, 2023
```
* move to phi

* fix

* fix type
```
add informata for strided grad kernel (#56947) · 89b91021
Committed by wanghuancoder on Sep 05, 2023

[Auto Parallel]: Support std::vector<phi::Tensor> input and output for DistTensor. (#56602) · d2fedeac

Committed by Ghost Screaming on Sep 05, 2023

* [WIP] Support std::vector<phi::Tensor> input and output for DistTensor.
Concat forward and backward are verified.

* Polish code for new dist tensor implementation.

* Fix bug of DistTensor upgrade. Add support functions for std::vector<Tensor> -> std::vector<Tensor>.

* Add support for DistTensor type of std::vector<phi::Tensor> as input or output of operators.
Following testcases are passed.
1. concat: std::vector<phi::Tensor> -> phi::Tensor
2. unbind: phi::Tensor -> std::vector<phi::Tensor>
3. broadcast_tensors: std::vector<phi::Tensor> -> std::vector<phi::Tensor>

* Polish code. Remove useless comments.

* Add update_loss_scaling in skip_op_lists.

* Polish code.

[clang-tidy] NO.8 enable `cppcoreguidelines-narrowing-conversions`. step:2 (#56895) · c2f0e9c4
Committed by gouzil on Sep 05, 2023
```
* [clang-tidy] replenish cppcoreguidelines-narrowing-conversions

* fix

* fix
```
[Fluid] move lars_momentum_xpu to phi (#56751) · 54b247b1
Committed by gouzil on Sep 05, 2023
```
* [Fluid] move lars_momentum_xpu to phi

* Empty-Commit;test=kunlun;
```
[XPU] Add element_mul_add_fuse_pass and elementwise_madd_xpu kernel (#56629) · 5efaaaa3
Committed by jiangfan06 on Sep 05, 2023

[clang-tidy] No. 57,58 cppcoreguidelines-explicit-virtual-functions... · 6dd9a024
Committed by xiaoye on Sep 05, 2023
```
[clang-tidy] No. 57,58 cppcoreguidelines-explicit-virtual-functions clang-analyzer-core.NonNullParamChecker (#56649)
```

04 Sep, 2023 9 commits
- Add rotate_half implementation for fused_rope (#56401) · c089a2af
  Committed by tianhaodongbd on Sep 04, 2023
```
* add rotate_half in fused_rope

* add position_ids in fused_rope

* modified examples about fused_rope

* add set_device in examples
```
- multihead_matmul op support codegen and kernel remove to phi (#56846) · 79bfb184
  Committed by Yuanle Liu on Sep 04, 2023
- add num_splist to support deterministic for flash_attn_bwd and FlashAttnUnpaddedGradKernel (#56363) · 7fd6ffb8
  Committed by niuliling123 on Sep 04, 2023
```
* add num_splist for flash_attn_bwd and FlashAttnUnpaddedGradKernel

* Add assertTrue

* Update submodule to a specific commit
```
- disable strided split (#56882) · eddf6d05
  Committed by wanghuancoder on Sep 04, 2023
```
* disable strided split
```
- [NewIR]support c_allreduce_sum/c_identity/c_embedding/c_embedding_grad (#56836) · 0e74bf36
  Committed by zhaoyingli on Sep 04, 2023
```
* [NewIR]add c_allreduce_sum/c_identity/c_reduce_sum/c_embedding/c_embedding_grad

* rm VLOG

* rm c_identity from LegacyOpList

* rm VLOG

* rm c_reduce_sum
```
- fix compile errors when using shared phi on windows (#56915) · 8aa1772c
  Committed by huangjiyi on Sep 04, 2023
```
* update

* fix bug

* fix bug

* fix bug

* fix bug

* rerun ci

* turn off shared_phi
```
- fix paddle namespace conflict when using paddle_flags (#56913) · 7d8402a8
  Committed by huangjiyi on Sep 04, 2023
```
* update

* update

* update
```
- optimize softmax_mask_fuse (#56877) · 25a0b46d
  Committed by duanyanhui on Sep 04, 2023
- reshard r to p (#56833) · a28e6f63
  Committed by LiYuRio on Sep 04, 2023
01 Sep, 2023 7 commits

export flags defined in phi on windows (#56848) · 17003369
Committed by huangjiyi on Sep 01, 2023
```
* update

* update
```

【Complex op】add complex support for index_select and index_sample (#56457) · 0b608393

Committed by Scotty on Sep 01, 2023

* support index_select op

* index_sample in cpu

* support index_sample in gpu

* change data_transform

* fix api gen and use skip_transform in yaml

[NewIR]Part-2.1 Refactor NewIRCompiler to support Group Ops (#56762) · 7adb4703

Committed by Aurelius84 on Sep 01, 2023

* [NewIR]Part-2.1 Refactor NewIRCompiler to support Group Ops

* fix gflags link error

* fix include ir_printer.h

* fix unittest

* fix conflict

* fix flags

* fix comment

[clang-tidy] enable bugprone-incorrect-roundings check (#56747) · e8a96347
Committed by gouzil on Sep 01, 2023

[clang-tidy] No.34,36 enable... · 17e4be21

Committed by cyberslack_lee on Sep 01, 2023

[clang-tidy] No.34,36 enable performance-noexcept-move-constructor,modernize-use-transparent-functors (#56261)

* fix

* fix

* CI

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* CI

* fix

* CI

[IR] Generate pd_op.parsed.yaml from pd_op.yaml (#56674) · 962f67d2

Committed by chen2016013 on Sep 01, 2023

* Generate pd_op.parsed.yaml from pd_op.yaml

* Generate pd_op.parsed.yaml from pd_op.yaml

* fix bug

* bug fix

* bug fix

* bug fix

* Add operators to pd_ops.yaml & change the storage path of pd_ops.parsed.yaml

* Fix path dependency bug & add .gitignore file

* fix bug - compat input args in save_combine op

* fix compat file

* fix set_value_with_tensor yaml

* split backward op in original yaml file

* add send_v2 & recv_v2

Fix custom device compile error caused by dist marco changing (#56760) · ddc81cc2
Committed by Chen Weihang on Sep 01, 2023
```
* fix custom device errro by dist

* polish details
```

31 Aug, 2023 3 commits

【complex op】No.7 add complex support for isclose (#56723) · d53972fd

Committed by iSerendipity on Aug 31, 2023

* add complex support for isclose

* add complex test for isclose

* fix template complie issue

* fix cuda compilation error

* fix type typo

* fix error for complex's abs

* add complex dtype into input

* fix ut

[NewIR]New ir using kernel registrer type (#56789) · a34bdb64

Committed by hong on Aug 31, 2023

* update

* fix batch norm grad args def

* fix bug

* fix combine slice bug

* fix slice bug

* update builtin split

* disable using kernel resigter dtype

* polish code

* disable some test

Add fused_scale_bias_relu_conv_bnstats OP (#55026) · 71e28b12

Committed by Tian Zheng on Aug 31, 2023

* Add fused_scale_bias_relu_conv_bnstats op

* Review changes

* Fix no CUDNN Frontend build

* Fix PADDLE_ENFORCE format

* Fix PADDLE_ENFORCE CI error

* Rename kernel filename

* Refactor unittest to use paddle eager_op_test

* Fix padding bugs

* Review changes

* test=cuda117

* test=cuda117

PaddlePaddle / Paddle synced successfully about 1 year ago
