Commits · 4f403d3e3565d8c87c997f05f08d27b19c1109d1 · BaiXuePrincess / Paddle

Sep 17, 2022 · 5 commits
- update strategy (#46138) · 4f403d3e
  Committed by zhaoyingli on Sep 17, 2022
  
  4f403d3e
- Fix bug of reduce_sum op. (#46045) · 28b4240b
  Committed by Ghost Screaming on Sep 17, 2022
```
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
is wrong.

* Fix some problems.
1. Change fluid head files to phi files.
2. Delete useless code.
3. Fix code style problems.

* Fix some code style problems.

* Fix some code style problems.
```
  28b4240b
- [BugFix] fixed a bug in decorator transformer, it can not analyze decorator... · 22530137
  Committed by feifei-111 on Sep 17, 2022
```
[BugFix] fixed a bug in decorator transformer, it can not analyze decorator with params correctly (#46055)

* fix deco call

* add raise

* add test

* add warn, fix paddle api

* fix error type

* fix coverage
```
  22530137
- fix compilation errors on mac arm64 (#46117) · abe1dca3
  Committed by Yuanle Liu on Sep 17, 2022
  
  abe1dca3
- fix static_check error when compile twice (#46140) · b680fb80
  Committed by zhouweiwei2014 on Sep 17, 2022
  
  b680fb80
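The reduce_sum fix (28b4240b) in the list above says only that results are wrong when `input.numel() > INT32_MAX`; it does not describe the cause. The sketch below illustrates one common way such a bug arises, under the assumption that an element count or index was held in a 32-bit signed integer. The helper names `to_int32` and `needs_int64_index` are hypothetical and are not Paddle APIs.

```python
INT32_MAX = 2**31 - 1


def to_int32(n: int) -> int:
    """Wrap n the way a two's-complement 32-bit signed integer would."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


def needs_int64_index(numel: int) -> bool:
    """Hypothetical guard: True when a 32-bit index cannot address every element."""
    return numel > INT32_MAX


numel = 3 * 2**30  # an element count larger than INT32_MAX
print("needs 64-bit index:", needs_int64_index(numel))
print("count as seen through int32:", to_int32(numel))
```

A count that wraps to a negative or too-small value makes a loop or kernel launch visit the wrong range of elements, which is consistent with a wrong sum for very large inputs. Widening the index type to 64 bits avoids the wrap.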
Sep 16, 2022 · 18 commits
- Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84
  Committed by sneaxiy on Sep 16, 2022
```
* support int64 non-broadcast

* support broadcast case for int64 index

* fix bug

* support more Arity

* remove some codes

* upgrade patchelf to v0.15.0 to pass CI build

* fix bug

* fix patchelf installation

* add debug flags

* remove useless codes

* fix viterbi_decode and set_value op uts

* remove always enable int64
```
  20b5bf84
- fix bug for TransformedDistribution (#46035) · 71c5ec87
  Committed by MayYouBeProsperous on Sep 16, 2022
  
  71c5ec87
- optimize device synchronization in profiler (#46089) · 2a5bd7dc
  Committed by chenjian on Sep 16, 2022
```
* avoid to synchronize all devices

* synchronize custom device
```
  2a5bd7dc
- this pr is for optimizing precise test (#46068) · 4e8ad06a
  Committed by risemeup1 on Sep 16, 2022
```
* this pr is for optimizing precise test

* modify get_pr_ut.py

* modify get_pr_ut.py
```
  4e8ad06a
- Clear extra attrs of scale in OpMaker (#45984) · 0a904f8b
  Committed by zyfncg on Sep 16, 2022
```
* clear extra attr of scale in opmaker

* fix sum bug

* fix merge conflict

* fix minus
```
  0a904f8b
- Remove redundant code in pe engine (#46110) · 5c3e8585
  Committed by WangZhen on Sep 16, 2022
  
  5c3e8585
- [Auto Parallel] Bugfix allreduce fuse for MP (#46086) · 134f9c3e
  Committed by JZ-LIANG on Sep 16, 2022
```
* bugfix

* bugfix

* typos fixed
```
  134f9c3e
- [Eager] Fix linspace error in amp (#46088) · 4fba3d5e
  Committed by Jiabin Yang on Sep 16, 2022
```
* fix linspace error in amp

* fix log

* fix amp error
```
  4fba3d5e
- Fix bugs in "test_custom_plugin_creater" unit test (#46075) · be00a42f
  Committed by weishengying on Sep 16, 2022
  
  be00a42f
- Modify callstacklevel flag for c++ (#46058) · d072aaeb
  Committed by JingZhuangzhuang on Sep 16, 2022
  
  d072aaeb
- add interpretercore for jit engine (#46092) · 22c3cdb4
  Committed by Leo Chen on Sep 16, 2022
```
* add interpretercore for jit engine

* add ut
```
  22c3cdb4
- Correct spelling errors (#46108) · 08186f14
  Committed by Zhang Zheng on Sep 16, 2022
  
  08186f14
- [CustomDevice] add new executor support (#46038) · 268f097e
  Committed by ronnywang on Sep 16, 2022
```
* [CustomDevice] add custom_device_resource_pool & device_event_custom_device

* update

* update

* update

* update
```
  268f097e
- Correct order of passes (#45936) · cbda49e6
  Committed by joanna.wozna.intel on Sep 16, 2022
  
  cbda49e6
- support pow with scalar input, square, cast, var, size operators for deepxde (#46024) · 1711407d
  Committed by Xiaoxu Chen on Sep 16, 2022
```
* add reduce_mean,reduce_sum primitive ops
* add ne_p gt_p primitive operators
* add ge_p abs_p primitive operators
* add cast primitive operators
* add pow,square prim2orig rules
* add elementwise_div orig2prim rule
```
  1711407d
- Unify core avx and core_noavx to libpaddle (#46095) · 267d71a4
  Committed by Chen Weihang on Sep 16, 2022
```
* unify  core_avx and core_noavx

* fix except error

* revert mac compile logic

* revert dylib to so

* add core_noavx branch

* remove core_noavx

* replace paddle_core by lib paddle

* polish var name

* replace paddle_core by libpaddle

* update custom device commit

* polish code by comments
```
  267d71a4
- refactor mp. (#45803) · fa97e5ba
  Committed by wuhuachaocoding on Sep 16, 2022
```
* refactor mp.

* update setup.py.

* update mp_layers.py for compatibility.

* add documents for mp_layers.py

* update init.py

* update collective.py.

* update.

* update mp_ops.py

* update.

* update code style.

* update code style.
```
  fa97e5ba
- Support both use_calc_stream and sync_op in send recv APIs (#46023) · ae00f428
  Committed by Wen Sun on Sep 16, 2022
  
  ae00f428
Sep 15, 2022 · 17 commits

Committed by ziyoujiyi on Sep 15, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fix gloo compile warning

92e1f64b

- [jit] skip forward save (#45901) · 483ba282
  Committed by Hui Zhang on Sep 15, 2022
```
* skip forward save

* fix bug

* more ci for jit skip forward
```
483ba282

- [Auto Parallel] Improve the APIs (#45776) · b042a3b1
  Committed by Yulong Ao on Sep 15, 2022

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Add the serialization process for dist attrs

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix some bugs

* [Auto Parallel] Fix the code style

* [Auto Parallel] Remove unnecessary impls

* [Auto Parallel] Fix the importing error

* [Auto Parallel] Fix the copy from bugs of op dist attr

* [Auto Parallel] Replace the use of constexpr if

* [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh

* [Auto Parallel] Change API of the completion unittest

* [Auto Parallel] Fix the bug when set_attr an int

* [Auto Parallel] Add the unittest for the serialization

* [Auto Parallel] Add some unit tests

* [Auto Parallel] Unify the strategy

* [Auto Parallel] Improve the engine api

* [Auto Parallel] Reset the changes made to the framework

* [Auto Parallel] Change the engine unittest

* [Auto Parallel] Update API of the completion and partitioner

* [Auto Parallel] Update unit tests using engine api

* update shard annotation

* [Auto Parallel] Remove the modifications of other modules

* [Auto Parallel] Add docs for APIs

* add new strategy

* [Auto Parallel] Replace the logger

* [Auto Parallel] Restore the test_program.py

* [Auto Parallel] Change the import rules

* [Auto Parallel] Add the examples for Engine

* [Auto Parallel] Do some minor changes

* [Auto Parallel] Remove yaml dependency

* [Auto Parallel] Fix the unittests

* add valid after train

* bug fix

Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>

b042a3b1

- refine PADDLE_WITH_MKLDNN code (#46053) · ea96172e
  Committed by HongyuJia on Sep 15, 2022
```
* refine PADDLE_WITH_MKLDNN code

* fix data_norm_op

* polish addmm_op
```
ea96172e
- remove tmp fp32 var for gaussian_random (#46033) · 3671d114
  Committed by Guoxia Wang on Sep 15, 2022

3671d114
- Revert "Fix argsort in XPU black list for XPU KP (#45975)" (#46064) · f3206b09
  Committed by niuliling123 on Sep 15, 2022

f3206b09

- updating mul and matmul with set_mem_desc (#45624) · 416e0de7
  Committed by Jacek Czaja on Sep 15, 2022

* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix

416e0de7

- [CodeStyle][W291] trim trailing whitespace in NPU unittest file (#46042) · 5022dd9b
  Committed by Nyakku Shigure on Sep 15, 2022

5022dd9b
- [CodeStyle] trailing whitespace hook for doc and cpp related files (#46067) · 710efdae
  Committed by Nyakku Shigure on Sep 15, 2022

710efdae
- Optimize flip kernel by eliminating H2D data transfer, test=develop (#46046) · b3283f4c
  Committed by 傅剑寒 on Sep 15, 2022

b3283f4c
- fix_recover_remove_padding kernel (#46050) · 65bdd80b
  Committed by Wangzheee on Sep 15, 2022

65bdd80b

- Clear extra attrs of elementwise op in OpMaker (#45845) · b26efe0d
  Committed by zyfncg on Sep 15, 2022

* clear extra attrs of elementwise op in opmaker

* fix op_debug_string_test

* fix bug of grad_add

* fix sort of runtime attrs

b26efe0d

- Support 0 shapes input Tensor for MKL slice (#45930) · 1d78681d
  Committed by WangZhen on Sep 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```
1d78681d
- Performance fix for broadcast kernel [Part3] (#45854) · f48b1264
  Committed by limingshu on Sep 15, 2022
```
* first commit

* fix some bugs in code

* fix bugs

* to optimize merge one dimension feature
```
f48b1264
- [CodeStyle] trim trailing whitespace in .h, .cc, .cu, etc. (#46006) · 8dde7aea
  Committed by Nyakku Shigure on Sep 15, 2022

8dde7aea
- General Plugin Mechanism (#45355) · bc77e6d5
  Committed by weishengying on Sep 15, 2022

bc77e6d5
- [Eager] saved_tensors_hooks (#45763) · b294f054
  Committed by wanghuancoder on Sep 15, 2022
```
* saved_tensors_hooks
```
b294f054

BaiXuePrincess / Paddle is in sync with the fork source project
