Commits · cad2e68de2c8f3e405d753884bb1c47d74983e3b · PaddlePaddle / Paddle

02 Nov, 2022 5 commits

[Zero-Dim] support input 0D Tensor for some binary api (#46909) · cad2e68d

Committed by zhouweiwei2014 on Nov 02, 2022

Improve the tool for checking nan and inf, and support to compute the max, min... · ad39043f

Committed by Yiqun Liu on Nov 02, 2022

Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095)

* Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor.

* Add a FLAGS to control whether abort when meets inf/nan and polish codes.

* Fix unittest.

* Change the computing of mean.

Support generating static code of high order grad op by yaml (#47511) · bafa890a

Committed by zyfncg on Nov 02, 2022

* support generating static code of high order grad op by yaml

* polish code

[XPU] add int64 support for slice and subtract. (#47409) · 77395619

Committed by houj04 on Nov 02, 2022

* [XPU] add int64 support for slice and subtract. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* remove unnecessary modification. test=kunlun

Add build option for CUDNN Frontend API (#47524) · eb100c7b

Committed by Tian Zheng on Nov 02, 2022

* Add build option for CUDNN Frontend API

* Fix review comments

* Change namespace for cudnn_frontend.h

01 Nov, 2022 21 commits

[CodeStyle][E711] use `is`/`is not` for comparison with `None` (#47452) · a35a4a53

Committed by Nyakku Shigure on Nov 01, 2022

* [CodeStyle][E711] use `is`/`is not` for comparison with `None`

* `self.assertTrue($A is None)` -> `self.assertIsNone($A)`

* `self.assertTrue($A is not None)` -> `self.assertIsNotNone($A)`

* `self.assertFalse($A is None)` -> `self.assertIsNotNone($A)`

* `self.assertEqual($A, None)` -> `self.assertIsNone($A)`

* `self.assertNotEqual($A, None)` -> `self.assertIsNotNone($A)`
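The rewrite rules listed in this commit can be illustrated with a small standalone sketch. It uses only the standard-library `unittest` module; the class name `NoneAssertionExamples` and the helper `run_examples` are hypothetical and do not come from the Paddle test suite.

```python
import unittest


class NoneAssertionExamples(unittest.TestCase):
    """Hypothetical test case showing the preferred None assertions."""

    def test_is_none(self):
        result = {}.get("missing")  # dict.get returns None for a missing key
        # Older spellings: self.assertTrue(result is None)
        #                  self.assertEqual(result, None)
        self.assertIsNone(result)

    def test_is_not_none(self):
        result = {"key": 0}.get("key")  # 0 is falsy, but it is not None
        # Older spellings: self.assertTrue(result is not None)
        #                  self.assertFalse(result is None)
        #                  self.assertNotEqual(result, None)
        self.assertIsNotNone(result)


def run_examples():
    """Run the example test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NoneAssertionExamples)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

One practical benefit of the dedicated assertions is that a failure reports the offending value rather than a bare "False is not true".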

fix dynamic link of xpu library (#47434) · 9d801855

Committed by Leo Chen on Nov 01, 2022

* refine comments,test=kunlun

* link xpu lib, test=kunlun

* add sleep for test, test=kunlun

* merge develop, fix compile, test=kunlun

* remove debug code, test=kunlun

* add dependency to avoid potential concurrency error, test=kunlun

[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045

Committed by HongyuJia on Nov 01, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

[Paddle Inference] add RegisterOutputHook interface (#47050) · db323927

Committed by Yuanle Liu on Nov 01, 2022

clean mkldnn headerfile (#47507) · a341bb8c

Committed by HongyuJia on Nov 01, 2022

[geometric] Optimize graph sample speed (#47531) · 2a932e55

Committed by Siming Dai on Nov 01, 2022

Fix bugs in tranpose kernel (#47212) · ec7fe888

Committed by limingshu on Nov 01, 2022

* first commit

* transpose_kernel_optimization

* first complishment of transpose op

* second commit

* refine code logics of tranpose_kernel

* refine transpose kernel

* first commit

* fix DtoD copy bugs for hip

* refine code according to the PR advice

* change dim to int64_t type.

* fix some type error

[PHI]Standardise some C++ API (Part2) (#47510) · 399047d7

Committed by YuanRisheng on Nov 01, 2022

* standard_api

* add hardtanh

fix (#47537) · 957fbb02

Committed by shentanyue on Nov 01, 2022

[CodeStyle][E712] use `if cond`/`if cond is True` for comparison with `True` (#47464) · 5a2ab683

Committed by Nyakku Shigure on Nov 01, 2022

* [CodeStyle][E712] use `if cond`/`if cond is True` for comparison with `True`

* revert changes in fluid

* revert unrelated file

* revert changes in norm

* revert changes in auto_parallel_amp

* fix norm and auto_parallel_amp

* revert a typo fix due to fixed at #47477
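The spellings named in this commit title are not interchangeable in plain Python, which the following minimal sketch illustrates. The three function names are hypothetical and are not taken from the Paddle code base.

```python
def is_enabled_loose(flag):
    """The pattern E712 flags: equality comparison against True."""
    # `==` uses Python equality, so the integer 1 also matches.
    return flag == True  # noqa: E712


def is_enabled_truthy(flag):
    """Equivalent to writing `if flag:`; accepts any truthy value."""
    return bool(flag)


def is_enabled_strict(flag):
    """Equivalent to writing `if flag is True:`; accepts only the True singleton."""
    return flag is True


# The three checks agree on real booleans but diverge on other values.
samples = [True, False, 1, "yes", 0, None]
comparison = {
    repr(value): (
        is_enabled_loose(value),
        is_enabled_truthy(value),
        is_enabled_strict(value),
    )
    for value in samples
}
```

Choosing between `if cond` and `if cond is True` therefore depends on whether non-boolean values can reach the check.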

Support custom stream for standalone executor (#47411) · e12b6c04

Committed by Ruibiao Chen on Nov 01, 2022

* [Auto Parallel] Improve the c++ dist attr

* [Auto Parallel] Modify test_program.py

* Support custom stream for standalone executor
Co-authored-by: Yulong Ao <aoyulong@baidu.com>

[EinsumOp] Einsum support complex grad (#47514) · e930c576

Committed by xiongkun on Nov 01, 2022

* Einsum Support Complex

* code fix

* add unittest for complex grad with einsum

* set rtol=1e-4

* fix

[CodeStyle][py2] remove `six` package (part2) (#47334) · 3592ba8c

Committed by Nyakku Shigure on Nov 01, 2022

* [CodeStyle][py2] remove `six` package (part2)

* six.ensure_str

* remove unused `import six`

* remove six from BUILTIN_LIKELY_MODULES

* remove six in example code

* remove some decode

* try to fix example code

* fix MockEtcdClient get/get_prefix returns data type

* fix MockEtcdClient get_prefix returns data

* fix MockEtcdClient get returns data

* remove `six` in pypi and conda requirements

* fix MockEtcdClient add_watch_callback/add_watch_prefix_callback returns data type

* refine MockEtcdClient

fix memory copy in prepare_data of FusedMultiTransformer pass (#47306) · 9ad0e37e

Committed by Kaipeng Deng on Nov 01, 2022

* fix memory copy in prepare_data. test=develop

[Lite][XPU] Upgrade lite subgraph api of xpu (#47373) · 8a1124b1

Committed by shentanyue on Nov 01, 2022

fix:add no support for cuda_arch<700 (#47509) · 974f8f32

Committed by feng_shuai on Nov 01, 2022

remove unused-local-typedefs warning on linux (#47513) · 96f36962

Committed by Wang Xin on Nov 01, 2022

Generate static graph code for some activation ops by Yaml (part2) (#47440) · c5d99138

Committed by zyfncg on Nov 01, 2022

* gene static graph code for ceil, expm1 op

* gene static graph code for some activation op

* fix bug

* revert doc of silu and logsigmoid

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Committed by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

fix p2p comm memory release logic (#47497) · f82d7e3c

Committed by Yuang Liu on Nov 01, 2022

summer-ospp 2022: 飞桨 PaddlePaddle Sparse Conv development and optimization: gather-gemm-scatter fuse (#46679) · 5158fa4f

Committed by umiswing on Nov 01, 2022

31 Oct, 2022 13 commits

[PHI]Standardise some C++ API (#47385) · 60e0c506

Committed by YuanRisheng on Oct 31, 2022

* standard api

* fix ci bugs

* fix ci bugs

* fix ce bugs

[Einsum] Einsum support repeated labels. (#47290) · 6e1c14e3

Committed by xiongkun on Oct 31, 2022

* add unittest for einsum-v2-trace and diagonal

* repeat labels.

* einsum support repeated labels.

* forward is ok for diagonal and undiagonalized.
TODO: check backward is ok by our theorem.

* backward is ok!

* fix by PR suggestions.

* fix ci error

* fix ci error

* fix ci warning

fix predictor memory write overflow (#47485) · de4a7911

Committed by wanghuancoder on Oct 31, 2022

* fix predictor memory write overflow

feat: add int8 support for vit (#47330) · 2953b708

Committed by feng_shuai on Oct 31, 2022

* feat: add int8 support for vit

* test:add test

[CustomDevice] GetCCLComm add custom device support (#47168) · 34d13d6a

Committed by ronnywang on Oct 31, 2022

* [CustomDevice] GetCCLComm add custom device support

* update

* update

* update

optimize: vit 384 (#47432) · 520adc0e

Committed by feng_shuai on Oct 31, 2022

* optimize: vit 384

* fix:bug

* fix:bug

* fix:supoort rocm complie

* refactor:name

* fix:support rocm

* fix:__HIP_NO_HALF_CONVERSIONS__

* optimize: delete scalar

* fix:rocm can't support

* fix:ernie error

[Auto Parallel] Improve the c++ dist attr (#47358) · b03b4a3c

Committed by Yulong Ao on Oct 31, 2022

* [Auto Parallel] Improve the c++ dist attr

* [Auto Parallel] Modify test_program.py

* [Auto Parallel] Add the missiong import

[ControlFlow] replace executor in run method of control flow ops with standalone_executor (#45696) · 3b219e5e

Committed by kangguangli on Oct 31, 2022

* replace executor in conditional_block_op.run with standalone_executor

* add block_id as the argument of standalone executor's method run; add print for program

* fix scope bug about conditional block op

* fix bug: unnecessary return of fetch value

* fix typo

* fix: quantization will set variable persistable, and these variables must exist in global scope

* add interpretercore cache for conditional block op but not activate in default

* fix bug: local scope reuse for conditional block op

* reset scope when conditional block op runs

* fix typo

* fix typo and code style

* add build scope for conditional block op

* add skip for transfer_layout kernel

* refind code

* fix reset_scope

* fix reset_scope

* refine code

* refine code

* refine code

1. remove flag use in conditional_block_op
2. pass execution_config to BuildOpFuncList instead of individual parameter

* refine code

* remove the use of FLAGS_control_flow_use_new_executor_cache

* change FLAGS_control_flow_use_new_executor to false

[MLU] fix compile error & add mlu blacklist function. (#47439) · bb6356e8

Committed by Chenxiao Niu on Oct 31, 2022

fix typos for `True` and `False` (#47477) · f5912d0c

Committed by Nyakku Shigure on Oct 31, 2022

* fix typo `Fasle`/`Flase` -> `False`

* fix typo `Ture` -> `True`

[Zero-Dim] support input 0D Tensor for reduce_sum/reduce_mean (#47219) · c8fc3379

Committed by zhouweiwei2014 on Oct 31, 2022

fix python module not found bug (#47438) · 81b93ebb

Committed by zhangbo9674 on Oct 31, 2022

* fix python module not found bug

* delete unused cast,test=allcases

remove boost compiler flags in flags.cmake (#47468) · 91096ae2

Committed by Wang Xin on Oct 31, 2022

28 Oct, 2022 1 commit

generate static graph code for some ops by yaml (#47416) · 17fb92b3

Committed by zyfncg on Oct 28, 2022

PaddlePaddle / Paddle: synced successfully more than 1 year ago