Commits · c09b1d68b9afe78435ae7af76f86ace6da9ee9db · Crayon鑫 / Paddle

22 Apr, 2022 · 2 commits
- [IPU] add mixed-precission support for ipu (#41733) (#41906) · c09b1d68
  Committed by Allen Guo on Apr 22, 2022
```
add mixed-precission support for ipu

cherry-pick from #41733
```
  c09b1d68
- Change CINN tag, prepare for CINN release/v0.2 (#42065) · fd9c7818
  Committed by Huihuang Zheng on Apr 22, 2022
```
Change CINN Tag to Prepare for CINN release/v0.2. This PR is the cherrypick of #42063
```
  fd9c7818
21 Apr, 2022 · 23 commits
- [Eager] Support numpy.narray as input for eager expand (#42043) (#42064) · ef0b5fdc
  Committed by Weilong Wu on Apr 21, 2022
  
  ef0b5fdc
- [Eager] remove useless logic (#42020) (#42061) · 218e759b
  Committed by Weilong Wu on Apr 21, 2022
  
  218e759b
- [Cherry-Pick]Move pass optimizations into CINN. (#42047) (#42070) · 2f2f987c
  Committed by Zhen Wang on Apr 21, 2022
```
* Move pass optimizations into CINN.
```
  2f2f987c
- [Cherry-pick]Release2.3/fix doc of nms op (#42024) · dbdb56d1
  Committed by RichardWooSJTU on Apr 21, 2022
```
* fix nms op doc missing default value

* fix nms op doc add blank line
```
  dbdb56d1
- support bce_loss and bce_loss_grad in XPU, test=kunlun (#41610) · b1ba98ca
  Committed by zhangyikun02 on Apr 13, 2022
  
  b1ba98ca
- [cherry-pick]support multi_layer of bilstm,*test=kunlun (#42076) · 58f6d459
  Committed by z8hanghuan on Apr 21, 2022
```
* modify xpu.cmake,*test=kunlun (#41832)

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* support bilstm,*test=kunlun

* [cherry-pick]support multi_layer of bilstm,*test=kunlun
```
  58f6d459
- [Cherry-pick] fix the bug for nccl barrier and alltoall (#42042) · 8a12f459
  Committed by lilong12 on Apr 21, 2022
```
* fix_nccl_barrier (#41970)

* be compatible with the old version of alltoall (#42007)
Co-authored-by: Baibaifan <39549453+Baibaifan@users.noreply.github.com>
```
  8a12f459
- fix bug for eager mode distributed training (#41841) (#41953) · f5a937eb
  Committed by lilong12 on Apr 21, 2022
  
  f5a937eb
- update (#41636) (#41757) · 0ef694ac
  Committed by lilong12 on Apr 21, 2022
  
  0ef694ac
- Fix pipeline in new dygraph (#41937) (#42053) · 7eae6570
  Committed by ShenLiang on Apr 21, 2022
```
* fix utest

* fix time
```
  7eae6570
- fix inf in fused_attention (#41933) (#42032) · 50fd2450
  Committed by WangXi on Apr 21, 2022
  
  50fd2450
- double accessor and show_scale (#41943) (#42014) · efaef31a
  Committed by wangguanqun on Apr 21, 2022
```
* double accessor and show_scale

* double accessor and show_scale

* rename

* fix bug in pslib config

* add unittest
```
  efaef31a
- update gpu fp16 op blacklist (#41703) (#42051) · 97104695
  Committed by baoachun on Apr 21, 2022
```
* update gpu fp16 op blacklist

* update blacklist
```
  97104695
- add mkldnn int8 pass [step1] (#41579) (#42045) · 04f20b83
  Committed by baoachun on Apr 21, 2022
  
  04f20b83
- [cherry-pick] Adjust the Phi C++ API and yaml (#41576, #41778, #41909) (#41928) · d24a402e
  Committed by zyfncg on Apr 21, 2022
```
* [PHI] Support some c++ api in paddle namespace (#41778)

* support some c++ api in paddle namespace

* change c++ api namespace in custom op

* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576)

* support setting vector out size in yaml

* support setting size of vector<tensor> for out in yaml

* add data transform config for shape and size (#41909)

* fix api_gen bug
```
  d24a402e
- [Cherry-pick] Optimize dygraph scheduling performance (#42010) · ec1d2a16
  Committed by Chen Weihang on Apr 21, 2022
```
* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576)

* support setting vector out size in yaml

* support setting size of vector<tensor> for out in yaml

* resolve conflict
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
```
  ec1d2a16
- [Eager]Fix full_like/clip with np.generic type as attribute (#41808) (#41974) · e4cb897e
  Committed by Aurelius84 on Apr 21, 2022
```
* [Eager]Fix full_like/clip with np.generic type as attribute

* support numpy genertic

* remove usless code
```
  e4cb897e
- [cherry-pick] enable auto-tune when using cinn (#41795) (#42006) · f5d356b8
  Committed by TeFeng Chen on Apr 21, 2022
```
cherry-pick #41795
```
  f5d356b8
- update demo_ci ut threshold (#41981) (#42030) · efddf9ea
  Committed by baoachun on Apr 21, 2022
  
  efddf9ea
- fix adaptive pool pass bug (#42022) · 5b9cdd9b
  Committed by JingZhuangzhuang on Apr 21, 2022
  
  5b9cdd9b
- [Cherry-pick] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode (#41994) · af7439ad
  Committed by Jiabin Yang on Apr 21, 2022
```
* cherry-pick python/paddle/utils/code_gen/backward.yaml

* remove unsupported yaml
Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
```
  af7439ad
- [Eager] make fast through to linear (#41945) (#41995) · 0c141322
  Committed by Jiabin Yang on Apr 21, 2022
```
* make fast through to linear

* make fast through to linear

* add to do for later upgrades

* support build once for now
```
  0c141322
- [Cherry-pick] Polish custom op details (#42008) · f637e3d2
  Committed by Chen Weihang on Apr 21, 2022
```
* polish tensor api details (#41971)

* [CustomOp] Fix custom op pinned input error (#41972)

* fix custom op pinned input error

* fix compile error

* fix inference custom op (#41999)

* resolve conflict
```
  f637e3d2
20 Apr, 2022 · 15 commits

update (#41762) (#41843) · 1e18b57b
Committed by lilong12 on Apr 20, 2022

1e18b57b
[Dygraph] Refactor Model Parallel in eager mode (#41761) (#41960) · 5ce7f48d
Committed by Haohongxiang on Apr 20, 2022
```
* refactor mp in eager mode

* update

* update

* add uts
```
5ce7f48d

cherry pick recent updates in graph-engine to release2.3 (#42027) · ef78c9c2

Committed by seemingwang on Apr 20, 2022

* gpu_graph engine optimization+ (#41455)

* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* recover test

* recover test

* fix spelling

* recover

* fix

* Cpu gpu graph engine (#41942)

* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* recover test

* recover test

* fix spelling

* recover

* fix

* fix linking problem

* remove comment

ef78c9c2

[cherry-pick] Cherry pick pr of new-exec (#42009) · 80992253

Committed by Leo Chen on Apr 20, 2022

* [new-exec] shrink downstream map (#41471)

* shrink downstream map

* shrink last live ops of var

* add comment

* fix bug

* add dependency for send/recv to support pp parallel (#41652)

* [new-exec] clear the scope listener after run (#41947)

* clear the listener after run

* only sync variables in program

* refine code

* fit for lod_tensor_blocking_queue

80992253

[cherry-pick]fix StickBreakingTransform forward error when input rank is over 2 (#41940) (#41983) · 60c212d5
Committed by Xiaoxu Chen on Apr 20, 2022

60c212d5
windows compile add onnxruntime switch (#41988) (#42015) · 23cc4636
Committed by heliqi on Apr 20, 2022
```
Add an onnxruntime build option to the Windows build script
```
23cc4636
[cherry-pick] Add AutoTune to reader.py for DataLoader (#42004) · a8ee07c8
Committed by niuliling123 on Apr 20, 2022
```
Add AutoTune to reader.py for DataLoader
```
a8ee07c8
fix:conflict (#41913) · 4ef0a0b7
Committed by feng_shuai on Apr 20, 2022

4ef0a0b7

Cherry-pick PR41720, support no_need_buffer in eager_fluid state (#41720) (#41956) · 279d2db3

Committed by pangyoki on Apr 20, 2022

* support no_need_buffer in eager_fluid state

* change no_need_buffer info from fwd_info to bwd_info

* fix CI fail, gru_unit donnot use no_need_buffer

* fix conflict between no_need_buffer and dispensable

* use tensor.define in dispensable

* solve conflict

* solve conflict

279d2db3

Fixed performance issue regarding BackwardRun using add_final_state_dygraph (#41912) (#41991) · 968bf46e
Committed by Jiabin Yang on Apr 20, 2022
```
Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
```
968bf46e

[Phi] Support construct Scalar by using Non-CPU Tensor (#41765) (#41963) · 3b25afb2

Committed by YuanRisheng on Apr 20, 2022

* support construct scalar using non-cpu tensor

* fix bugs when run unittest

* fix compile bugs

* fix bugs when run ci

* fix compile bugs

* fix bugs when move copy

* perfect unit test

* perfect unittest

* update according to comment

* add target dependency

* deal with conflict

* fix bugs when run unit test

* fix unit test bugs

3b25afb2

[Cherry-pick]fix bug for eager mode distributed training (#41975) · 9a75b4b9

Committed by Aurelius84 on Apr 20, 2022

* update (#41636)

* fix bug for eager mode distributed training (#41841)
Co-authored-by: lilong12 <lilong12@baidu.com>

9a75b4b9

[Cherry-Pick]Fix expand_sig infershape BUG under static graph mode and NeedTransformPlace behavior if set skip_transform in yaml (#41973) · 93f0e594

Committed by Aurelius84 on Apr 20, 2022

* [Phi]Fix expand_sig infershape BUG under static graph mode (#41936)

* [Phi]Fix expand_sig infershape BUG under static graph mode

* [Phi]Fix expand_sig infershape BUG under static graph mode

* [Phi]Fix unittest

* [Phi]Fix unittest

* [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml (#41920)

* [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml

* add unittest for full_like

* fix unittest

93f0e594

[cherry-pick] Refine user experience for profiler (#41989) · 2ea5e02c

Committed by chenjian on Apr 20, 2022

* fix divide zero error when cpu only (#41794)

* reduce performance influence by RecordEvent in Python (#41822)

* reduce performance influence

* add unit test

* fix

* Rebase for profiler statistic ratio (#41939)

* fix according to suggestion

* add kernel summary

* improve coverage

2ea5e02c

[cherry-pick] Implement Amp Layout AutoTune(41884) (#41964) · 85a4ecb6
Committed by Zhang Ting on Apr 20, 2022
```
 cherry-pick #41884 
```
85a4ecb6
