Commits · bea1f04c300ab932fd96f27ee1304f507e7bb6ba · PaddlePaddle / Paddle

24 Jul, 2023 7 commits

onednn: remove fc_elementwise_add fusion (#55504) · bea1f04c

Committed by Xinyu Chen on Jul 24, 2023

* onednn: remove fc+eltwiseadd fusion pass
* onednn: remove post-sum fusion in fc kernel
* onednn: tests: make unfused add run into f32

bea1f04c

delete modification on pre-commit (#55519) · 5b8f0637
Committed by 傅剑寒 on Jul 24, 2023

5b8f0637
Fix test_sparse_norm_op failure. (#55405) · 4ff8fca5
Committed by umiswing on Jul 24, 2023
```
* Fix test failed on cudnn.

* Fix codestyle.
```
4ff8fca5
[Semi-Auto] add split spmd rule (#55397) · cf76e7ae
Committed by Yichen Zhang on Jul 24, 2023
```
* add split spmd rule

* add pytest in cmake file

* small fix
```
cf76e7ae

Order print attribute map (#55518) · 1f3e6ec4

Committed by xingmingyyj on Jul 24, 2023

* fix_ir_printer

* Update ir_printer.cc

* Update ir_printer.cc

* Update ir_printer.cc

* Update ir_printer.cc

* Update ir_printer.cc

* Update paddle/ir/core/ir_printer.cc
Co-authored-by: kangguangli <kangguangli@hotmail.com>

* Update ir_printer.cc

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>

1f3e6ec4

[AutoParallel] Add shard tensor and DistAttr api (#55494) · bd60757d

Committed by Chen Weihang on Jul 24, 2023

* add shard tensor api

* add DistAttr api

* add unittest for coverage

* fix process mesh sample code

* fix checking error

bd60757d

[IR Dialect] ⚔Elden chapter 1.1⚔ (#55525) · 2b8e6285
Committed by 张春乔 on Jul 24, 2023
```
* IntArrayAttributeStorage
```
2b8e6285

22 Jul, 2023 3 commits
- fix group_shard3_get_all_parameter (#55572) · 6da9db50
  Committed by zhenhailiu on Jul 22, 2023
  
  6da9db50
- Fix launch error when PADDLE_TRAINER_ENDPOINTS is too long (#55478) · db921ae9
  Committed by sneaxiy on Jul 22, 2023
```
* fix new launch

* fix ps uit
```
  db921ae9
- [PHI CAPI] Add support for registering a new operator, PART2 (#55533) · 14006e96
  Committed by ronnywang on Jul 22, 2023
  
  14006e96
21 Jul, 2023 8 commits
- [NewIR][BugFix] fix empty_var_name problem (#55546) · de3e9c30
  Committed by kangguangli on Jul 21, 2023
```
* fix empty_var_name problem

* fix coverage ci

* fix coverage ci
```
  de3e9c30
- [Inference] save_optimized_model_pass support gpu (#55551) · 4b3ac86d
  Committed by Yuanle Liu on Jul 21, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support gpu
```
  4b3ac86d
- [clang-tidy] enable modernize-make-unique (#55506) · 45d49619
  Committed by Ruibin Cheung on Jul 21, 2023
  
  45d49619
- [clang-tidy] enable modernize-use-override (#55491) · cd0f1523
  Committed by Ruibin Cheung on Jul 21, 2023
  
  cd0f1523
- [OpCompat] add assert in op_compat.yaml (#55500) · 2d98758c
  Committed by Zhan Rongrui on Jul 21, 2023
```
* add assert

* fix

* fix
```
  2d98758c
- Develop the grad_fn and next_functions APIs and expose them to Python; move the unit test file to a proper location (#55311) · 03f06841
  Committed by qiuwenbo on Jul 21, 2023
```
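* [Experiment] Add an attribute to tensor whose value is the constant 1

* Expose gradnode and add a new gradnode method (for testing) that is exposed to and accessible from Python

* Develop the grad_fn and next_functions APIs and expose them to Python; apply some normalization

* Add a unit test

* Improve code style

* Move the unit test file to the correct location

* Improve code style

* Remove unused comments

* Fix __main__ has no attribute

* Modify the unit test file

* Modify the unit test script (temp)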
```
  03f06841
- Bugfix, CUB regression in CUDA 12.2 (#55594) · b2c797ad
  Committed by Jeng Bai-Cheng on Jul 21, 2023
```
Issue #55016
```
  b2c797ad
- [0D-Tensor] CINN supports argmin, fix infershape (#55505) · da2395f3
  Committed by HongyuJia on Jul 21, 2023
  
  da2395f3
20 Jul, 2023 22 commits

polish some code (#55583) · f172b02f
Committed by Leo Chen on Jul 20, 2023

f172b02f

Clean unused old graph compiler (#55484) · b0193f3a

Committed by Fisher on Jul 20, 2023

In preparation for the improvement of the graph compiler, the deprecated old graph compiler was cleaned up.

b0193f3a

[Static Graph Performance Optimization] Reuse graph dependency information (#55389) · ee65599e

Committed by Sonder on Jul 20, 2023

* add share api for DependencyBuilder

* add judge codes for sharing build results

* add ShareBuildResultsFrom

* update ShareDependencyFrom

* fix error

* add share codes

* fix memory error

* update according review

* update notes

* fix code style

* remove const_cast

* fix code style

ee65599e

[NewIR]Change feed list to variable list && support GPU (#55401) · 75517841

Committed by hong on Jul 20, 2023

* add feed with place op

* remove useless unitest

* update mkldnn

* update

* new ir support builtin slice op

* fix phi kernel adaptor bug

* add enable_static

* remove useless test case

* change feed list to single variable

* support gpu

* fix bug

* remove template

* add more data type

* fix compile bug

75517841

Fix UT failure (#55360) · 7eeff7b1
Committed by Leo Chen on Jul 20, 2023
```
* Fix TRT multihead matmul UT failure
```
7eeff7b1
Add fuse_linear_activation (#55420) · fa084e5e
Committed by niuliling123 on Jul 20, 2023

fa084e5e
[0D-Tensor] CINN supports gaussian_random (#55547) · 30f059d6
Committed by HongyuJia on Jul 20, 2023

30f059d6
[0D-Tensor] CINN supports topk, sort, argsort, fix infershape (#55510) · 5bfbaa8b
Committed by HongyuJia on Jul 20, 2023

5bfbaa8b
[cmake] add third party jitify cache (#55501) · 7341e6fc
Committed by Wang Xin on Jul 20, 2023
```
* [cmake] add third party jitify cache

* fix bug

* fixed

* fix bug
```
7341e6fc
[NewIR]add hsigmoid loss in whitelist (#55496) · c63aba9e
Committed by kangguangli on Jul 20, 2023
```
* fix hsigmoid_loss

* add test into whitelist

* fix whitelist
```
c63aba9e

Update gloo in dygraph (#55537) · 1d1e5484

Committed by Xing-lil on Jul 20, 2023

* update broadcast gloo in dygraph

* update

* update reduce gloo in dygraph

* update reduce gloo in dygraph

* update

* update allreduce allgather

* update all

* update

* update

* update

1d1e5484

add check_approval to paddle/fluid/framework/new_executor (#55542) · 982e0a9d
Committed by Chen Zhiyang on Jul 20, 2023

982e0a9d
[Dy2St] fix `func_self` maybe a callable empty list (#55554) · 3b58a68f
Committed by Nyakku Shigure on Jul 20, 2023

3b58a68f
[XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
Committed by zhupengyang on Jul 20, 2023

4df00939
[ARM] fix arm build failure with Ninja build, test=develop (#55548) · 4f307a7e
Committed by Qi Li on Jul 20, 2023

4f307a7e
rename hard_sigmoid to hardsigmoid for kernel name (#55559) · c3080386
Committed by zyfncg on Jul 20, 2023

c3080386

[XPU][PHI Kernels] bind reduce_max_int64 set_value_bool sin_grad_fp32... · ab00c96c

Committed by lijin23 on Jul 20, 2023

[XPU][PHI Kernels] bind reduce_max_int64 set_value_bool sin_grad_fp32 cos_grad_fp32 for XPU (#55375)

* bind kernels for xpu

* format code

* format code

* 0d support for set value

* refine set_value

ab00c96c

fix bug of constant folding pass (#55556) · bc61c796
Committed by ming1753 on Jul 20, 2023

bc61c796

[Kunlun] Modify some legacy code on distributed training (#55515) · 806f8d2b

Committed by XiaociZhang on Jul 20, 2023

* [Kunlun] Modify some legacy code on distributed training

Earlier limitations on XPUs, such as concat/split not being supported
and c_broadcast supporting only fp32, have recently been lifted.

This PR also adds support for multi-device profiling on XPU.
Without it, devices that enable profiling issue a hanging broadcast,
which eventually leads to a kernel timeout error.

* fix typo

806f8d2b

shard grad reduce (#55495) · 284e0d12
Committed by zhenhailiu on Jul 20, 2023

284e0d12

[Semi Auto] Entropy SPMD Rule (#55394) · 5f376f00

Committed by JZ-LIANG on Jul 20, 2023

* base rule

* add sharding merge

* add sharding axis merge

* define unified data class for inferencing dist_attr

* test wrap DistTensorSpec in dygraph mode

* matmul main logic done

* shape int64

* common cc

* define unified data class for inferencing dist_attr

* test wrap DistTensorSpec in dygraph mode

* define python api and wrap function in static mode for DistTensorSpec

* revise syntax

* map bugfix

* broadcast func

* compile 1

* add unitest

* add registry

* update unitest

* bugfix

* bugfix

* add pybind

* bugfix

* bugfix macro global name space

* bugfix macro global name space

* pybind

* pybind test

* pybind bugfixed1

* pybind bugfixed2

* pybind unitest

* merge dev

* merge dev

* merge dev

* fixed cmake conflict

* fixed cmake conflict

* rename get method

* revise inferforward output type

* revise comment

* replicated rule

* replicated rule 2

* revert bug deps

* add rule

* add unitest

* add rule

* add unitest

* move ut of auto_parallel

* fix ut

* bugfix

* bugfix

* bugfix

* bugfix

* bugfix

* bugfix

* bugfix

* resolute input sharding conflict maybe

* fixed comment

* add rule

* add unitest

* fixed typos

---------
Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

5f376f00

fix data load error in static mode (#55541) · 746e7cdc
Committed by Kai Song on Jul 20, 2023

746e7cdc

PaddlePaddle / Paddle synced successfully about 1 year ago