Commits · 993bc412b85114419b0e3baa30583f109cae437b · PaddlePaddle / Paddle

23 Apr, 2023 · 2 commits

apply gcc12 to gpups (#52960) · cbfd43e4

Committed by risemeup1 on Apr 23, 2023

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* apply gcc12 to gpups

* test

* test

* apply gcc12 to gpups

* apply_gcc12_to_gpups

* fix compiler bug

* fix compiler bug

* test

* fix dangling-pointer compiler

* fix dangling-pointer compiler

* fix dangling-pointer compiler

* apply_gcc12_to_gpups

* apply gcc12 to gpups

* Update cuda_streams_py.cc

cbfd43e4

Delete temp param in eager_gen (#53047) · 328195d7
Committed by niuliling123 on Apr 23, 2023
```
* Delete temp param in eager_gen
```
328195d7

21 Apr, 2023 · 1 commit

support 0-D output and 0-D as indice in __getitem__/__setitem__ (#52814) · 4e939c89

Committed by JYChen on Apr 21, 2023

* support 0-D output and 0-D as indice in __getitem__

* fix tests

* fix inference and UT

* add unittest for setitem

* fix xpu test

* fix xpu 0-d

4e939c89

20 Apr, 2023 · 2 commits
- [Fix typo] Fix typo error in eager_function.cc and data_type.h (#52932) · c91c9edb
  Committed by HongyuJia on Apr 20, 2023
  
  c91c9edb
- [CustomOP error] Add attrs type check (#53030) · 195d6d0f
  Committed by HongyuJia on Apr 20, 2023
```
* [CustomOP error] Add attrs type check

* fix global variable order bug

* include unordered_set

* fix ParseAttrStr compile error
```
  195d6d0f
19 Apr, 2023 · 1 commit
- [CustomDevice] add recompute support (#53044) · 3206fa80
  Committed by ronnywang on Apr 19, 2023
```
* [CustomDevice] add recompute support

* update
```
  3206fa80
18 Apr, 2023 · 2 commits
- Print the forward's stack when backward op has nan/inf and FLAGS_check_nan_inf_level = 0 (#52639) · 660f781b
  Committed by niuliling123 on Apr 18, 2023
  
  660f781b
- remove mlu (#53007) · 4d5a3ad6
  Committed by 张春乔 on Apr 18, 2023
  
  4d5a3ad6
17 Apr, 2023 · 3 commits
- cherry-pick fleet executor from 2.4 (#52896) · bafe287a
  Committed by LiYuRio on Apr 17, 2023
```
* cherry-pick fleet executor from 2.4

* fix test case
```
  bafe287a
- Support trt engine auto build in runtime for dynamic shape (#52162) · ebc58548
  Committed by JingZhuangzhuang on Apr 17, 2023
  
  ebc58548
- remove hccl in some .cc files (#52942) · 514d83de
  Committed by 张春乔 on Apr 17, 2023
  
  514d83de
14 Apr, 2023 · 3 commits

1. modify set_value op, use Scalars to represent attr `values`, instead of a... · dd2a749a

Committed by Feiyu Chan on Apr 14, 2023

1. modify set_value op to use Scalars to represent attr `values`, instead of a bunch of attributes of various types; (#52408)

2. add a program converter, with the set_value op as an example, which converts `paddle::framework::ProgramDesc` between the old and new formats (the differences are mainly some operators with incompatible updates in their definitions);
3. program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version;
4. provide an option `legacy_format=false` in serialization of `paddle::framework::ProgramDesc`; it decides whether to convert the ProgramDesc back to a legacy format, which paddle 2.4.2 or earlier versions can load and execute;
5. deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in legacy format (contains any of the operators that have been incompatibly updated and have any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead. Though it is not recommended, it can be used for testing.
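
The save/load behaviour described in this commit message can be pictured with a toy model. The sketch below is not Paddle code: a "program" is a plain list of op dicts, and every name in it (`save`, `load`, `is_legacy`, the typed attribute names) is hypothetical. The detection rule is also a stand-in chosen for the toy, namely "an updated op still carries typed value lists", and is not the rule the commit implements.

```python
# Toy model only: none of these names are Paddle APIs.
LEGACY_TYPED_ATTRS = ("bool_values", "int64_values", "fp64_values")
UPDATED_OPS = {"set_value"}  # ops whose definition changed incompatibly


def is_legacy(program):
    """True if an updated op still carries non-empty typed value lists."""
    return any(
        op["type"] in UPDATED_OPS
        and any(op["attrs"].get(name) for name in LEGACY_TYPED_ATTRS)
        for op in program
    )


def to_new_format(program):
    """Fold the typed value lists of updated ops into a single `values` list."""
    converted = []
    for op in program:
        attrs = dict(op["attrs"])
        if op["type"] in UPDATED_OPS:
            values = list(attrs.get("values", []))
            for name in LEGACY_TYPED_ATTRS:
                values.extend(attrs.pop(name, []))
            attrs["values"] = values
        converted.append({"type": op["type"], "attrs": attrs})
    return converted


def _legacy_attr_name(value):
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool_values"
    if isinstance(value, int):
        return "int64_values"
    return "fp64_values"


def to_legacy_format(program):
    """Split `values` back into one typed list, chosen from the first element."""
    converted = []
    for op in program:
        attrs = dict(op["attrs"])
        if op["type"] in UPDATED_OPS and "values" in attrs:
            values = attrs.pop("values")
            if values:
                attrs[_legacy_attr_name(values[0])] = list(values)
        converted.append({"type": op["type"], "attrs": attrs})
    return converted


def save(program, legacy_format=False):
    """Mimics the `legacy_format` option: convert back only when asked."""
    return to_legacy_format(program) if legacy_format else program


def load(program):
    """Mimics automatic detection and conversion on deserialization."""
    return to_new_format(program) if is_legacy(program) else program
```

With this toy, `load(save(program, legacy_format=True))` gives back the new-format program, while a program already in the new format passes through `load` unchanged.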

dd2a749a

rem cncl (#52434) · 25bd5ed8
Committed by Kim Yann on Apr 14, 2023

25bd5ed8

[CustomDevice] add model parallel support for custom device (#52872) · f8d09011
Committed by ronnywang on Apr 14, 2023

f8d09011

13 Apr, 2023 · 1 commit
- fix bug only on win (#52839) · 84d34ddd
  Committed by Yuanle Liu on Apr 13, 2023
  
  84d34ddd
12 Apr, 2023 · 1 commit
- Add layer func: float(), half(), bfloat16(). (#51635) · a64d50b7
  Committed by liuruyan on Apr 12, 2023
  
  a64d50b7
11 Apr, 2023 · 3 commits
- [Paddle Inference] Predictor support paddle::Tensor (#50445) · 10fd4a95
  Committed by Yuanle Liu on Apr 11, 2023
  
  10fd4a95
- [prim]use Operator to reconstruct the primitive operator defined in c++ (#51997) · dd74b3d1
  Committed by Xiaoxu Chen on Apr 11, 2023
  
  dd74b3d1
- [BUG Fixs] adadelta lr support (#49732) · 23032590
  Committed by wangzhen38 on Apr 11, 2023
  
  23032590
10 Apr, 2023 · 4 commits

[AMP] support master_grad for amp training (#52235) · 4970dd65

Committed by Zhang Ting on Apr 10, 2023

* support set master_grad

* move register_hook to auto_cast

* update unittest

* fix fp16 test

* update for review comments

4970dd65

[Opt Performance] Optimize custom operator performance (#52597) · 01247e33

Committed by HongyuJia on Apr 10, 2023

* [Opt Performance] Optimize custom operator performance, reconstruct python API auto-gen, add cache and use const inference

* opt AutoGradMeta implementation

* remove profiler codes

* fix unit test

* change year, 2021->2023

* fix int64_t parse bug

01247e33

[StandaloneExe] Remove flag about Executor (#52671) · d6ee0a13

Committed by kangguangli on Apr 10, 2023

* add strategy force_sequential_run

* remove flag

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

d6ee0a13

Remove WITH_ASCEND (#52669) · 0f3bbe10

Committed by 张春乔 on Apr 10, 2023

* mv WITH_ASCEND_CL

* mv WITH_ASCEND

* rollback

* remove WITH_ASCEND

* remove WITH_ASCEND

0f3bbe10

08 Apr, 2023 · 2 commits
- [StandaloneExe] add strategy force_sequential_run (#52652) · e1692dc7
  Committed by kangguangli on Apr 08, 2023
```
* add strategy force_sequential_run

* fix

* fix

* fix

* fix

* fix
```
  e1692dc7
- Retire Ascend- and Cambricon-related code: WITH_ASCEND_CL (#52612) · 2b40434e
  Committed by 张春乔 on Apr 08, 2023
```
* mv WITH_ASCEND_CL

* mv WITH_ASCEND

* rollback
```
  2b40434e
07 Apr, 2023 · 1 commit
- clean up WITH_MLU (#52546) · e75c01f9
  Committed by Wang Xin on Apr 07, 2023
  
  e75c01f9
06 Apr, 2023 · 3 commits
- mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
  Committed by 张春乔 on Apr 06, 2023
  
  80dd1672
- [Retire Ascend- and Cambricon-related code] No.9 Clean up PADDLE_WITH_ASCEND-related code (#52403) · 262ea02a
  Committed by 陈沧夜 on Apr 06, 2023
  
  262ea02a
- rem is_compiled_with_npu (#52385) · 7976e2a3
  Committed by Kim Yann on Apr 06, 2023
```
* rem is_compiled_with_npu

* rem npu related code

* make lint happy

* rem test

* remove some tests

* Update grad_scaler.py

* fix an error
```
  7976e2a3
05 Apr, 2023 · 1 commit
- fix Tensor.item to np.array(Tensor).item (#52483) · d95eaa17
  Committed by zhouweiwei2014 on Apr 05, 2023
  
  d95eaa17
04 Apr, 2023 · 1 commit

Add Gloo Gather Function (#52334) · 5f6376b7

Committed by yuehuayingxueluo on Apr 04, 2023

* add gloo gather

* add gloo_tools

* fix CI bug

* use gloo gather

* remove redundant code

* fix process_group_gloo.py

* rename send_recv

* fix conflict

* fix conflict

* fix codestyle

* fix CI bug

* add PADDLE_ENFORCE_NE

5f6376b7

03 Apr, 2023 · 2 commits
- remove WITH_ASCEND_CL PADDLE_WITH_ASCEND_CL WITH_ASCEND_CXX11 (#52448) · 0b60f28c
  Committed by engineer1109 on Apr 03, 2023
  
  0b60f28c
- rem is_compiled_with_mlu (#52378) · 4b28f4ff
  Committed by Kim Yann on Apr 03, 2023
```
* rem is_compiled_with_mlu

* fix some mlu_place and mlu_device_coount

* make lint happy
```
  4b28f4ff
01 Apr, 2023 · 2 commits
- Delete the /paddle/fluid/platform/device/npu directory (#52384) · 69436bf5
  Committed by jjyaoao on Apr 01, 2023
```
* Delete the /paddle/fluid/platform/device/npu directory

* clear Cmakelists

* Try removing npu in the header file
```
  69436bf5
- enable setting double attribute into opdesc (#52406) · 8a4aee18
  Committed by Feiyu Chan on Apr 01, 2023
  
  8a4aee18
31 Mar, 2023 · 2 commits

gather with doc (#52105) · 77d24854

Committed by zhenhailiu on Mar 31, 2023

* gather with doc

* resolve comment

* polish

* polish

* code style

* polish doc

* add_test

* polish

* polish

* add test check

* add test check

* polish

* polish

* polish

* polish

* fix_time_out

* polish

* fix timeout

* fix_timeout

* polish

* polish

* polish

* polish

* polish

77d24854

[CustomOP Optional Inplace] Custom op supports inplace optional tensor (#52216) · fcd77346

Committed by HongyuJia on Mar 31, 2023

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete custom_inplace_setup.py

* [CustomOP Optional Inplace] Custom operator supports inplace optional Tensor input

* fix bug for vector<Tensor> inplace test

fcd77346

30 Mar, 2023 · 3 commits

[Bug-fix] fix bug of Tensor.item() when CUDAPinnedPlace (#52322) · 0f9ec013
Committed by zhouweiwei2014 on Mar 30, 2023

0f9ec013

support complex data types for libpaddle.Tensor's element get and set (#52324) · 13b12457

Committed by Feiyu Chan on Mar 30, 2023

1. add type caster for paddle's complex type, to allow pybind to automatically cast it with python's complex type;
2. add complex64 and complex128 data type for `libpaddle.Tensor`'s element get and set(which is required to perturb an element to get the numerical derivative)
3. add support for cuda pinned place in `libpaddle.Tensor` element get and set

---
4. fix a bug in op code generation (the output folder was created concurrently with parsing of the op YAMLs).
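
Item 2 mentions perturbing a single element to obtain a numerical derivative. A minimal sketch of that idea follows, with a plain Python list standing in for the tensor; it is not Paddle's gradient checker, and `numerical_grad` is a hypothetical name. For complex elements it perturbs the real and imaginary parts separately and packs the two slopes as `df/dRe + 1j * df/dIm`, which is one possible convention and an assumption here.

```python
def numerical_grad(f, x, delta=1e-6):
    """Central-difference gradient of a real-valued f with respect to each element of x.

    x is a mutable list of floats and/or complex numbers; it is restored
    to its original contents before the function returns.
    """
    grad = []
    for i in range(len(x)):
        original = x[i]  # element get

        def slope(step):
            x[i] = original + step  # element set: perturb upwards
            plus = f(x)
            x[i] = original - step  # element set: perturb downwards
            minus = f(x)
            x[i] = original  # restore the element
            return (plus - minus) / (2 * delta)

        if isinstance(original, complex):
            # Perturb along the real axis, then along the imaginary axis.
            grad.append(complex(slope(delta), slope(1j * delta)))
        else:
            grad.append(slope(delta))
    return grad


def squared_norm(values):
    """Example objective: sum of |v|^2 over all elements."""
    return sum(abs(v) ** 2 for v in values)
```

For `squared_norm`, the slope along each axis is twice the corresponding component, so the element `1 + 2j` should yield a value close to `2 + 4j`.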

13b12457

[AMP] Add python API for collecting operator stats. (#52215) · 73544322

Committed by Yiqun Liu on Mar 30, 2023

* [AMP] Add python API for collecting operator stats.

* Fix import and polish codes.

* Add more unittest.

* Add doc for the new APIs.

73544322

PaddlePaddle / Paddle: synced successfully about 1 year ago
