Commits · e5bd7eb82eca1eeb83a742e48eea0dd1d284fbab · 机器未来 / Paddle

16 Jun, 2021 · 1 commit
- Add trt layer norm dynamic (#33448) · e5bd7eb8
  Committed by Shang Zhizhou on Jun 16, 2021
```
* 1, remove layernorm dynamic fp16; 2, let reshape out in dynamic shape (#33535)
```
15 Jun, 2021 · 4 commits

Cherry-pick support the bool tensor for the compare ops (#33551) · c334d2bd
Committed by wawltor on Jun 15, 2021

[cherry-pick] fix gather bug && fix hang of new_group (#33553) · a4e841e0

Committed by ShenLiang on Jun 15, 2021

* Fix gather infer shape using axis (#33413)

* fix gather shape bug

* fix None

* fix topo

* Fix hang of hybrid parallel in new_group  (#33141)

* fix hang of hybrid parallel

* fix new_group for hang problem

* fix hang

[Cherry-Pick] Fix the segfault when using to_tensor in PyLayer. (#33303) (#33518) · 0079e0b1

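Committed by WeiXin on Jun 15, 2021

Fixes the segfault triggered when a PyLayer returns a tensor created by to_tensor.
Causes:

* If the stop_gradient attribute has been modified on the Python side, InnerSetOverridedStopGradient on the C++ side cannot change it, so SetOverridedStopGradient is now called on the C++ side to modify stop_gradient.

* The grad var of a tensor produced by to_tensor has the default DataType (-1), but during backward the grad var's DataType must not be the default (-1), so ForwardDataType is now called to set the grad var's DataType.

Original PR: #33303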

refix if-else logic for inference: missing if (#33531) · f7034613
Committed by wenbin on Jun 15, 2021

12 Jun, 2021 · 1 commit

Fix LayerNorm Problem Release2.1 (#33534) · a43e1fac

Committed by zhiboniu on Jun 12, 2021

* Eliminate numerical differences of LayerNorm; fix LayerNorm Nan Bug while large data input

* fix bug while large shape of data input

11 Jun, 2021 · 3 commits
- [cherry-pick 2.1.1]2.1/fix concat (#33383) · 9567cbd7
  Committed by liuyuhui on Jun 11, 2021
```
* add unit8 for concat (#32850)

* add bool type for tril api (#33402)
```
- [Cherry-pick] Support diff dataset tensor place in single process dataloader (#33470) (#33487) · 14440905
  Committed by Chen Weihang on Jun 11, 2021
```
Support diff dataset tensor place in single process dataloader

cherry-pick of #33470
```
- [cherry-pick]Fixed a bug of log_softmax: op input was modified to 'nan' (#32937) (#33436) · 61cae0df
  Committed by Lijunhui on Jun 11, 2021
```
While running the op benchmark, we found that when the input size is below a certain threshold, the input of the Python-side log_softmax API is overwritten with nan after the computation. The output is correct.

Cherry-picked from #32937
```
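The failure mode in the log_softmax commit above (an op's input silently overwritten while its output still looks correct) can be caught by a small regression check. The following is a minimal pure-Python sketch under stated assumptions, not Paddle's kernel or its test suite: `log_softmax` here is a hypothetical reference implementation over a flat list of floats, and `check_input_unchanged` is a hypothetical helper that snapshots the input before the call and compares it afterwards.

```python
import copy
import math


def log_softmax(values):
    """Reference log-softmax over a flat list of floats (hypothetical helper).

    Builds a new list and never writes to `values`.
    """
    peak = max(values)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(v - peak) for v in values))
    return [v - peak - log_sum for v in values]


def check_input_unchanged(op, values):
    """Run `op` on `values`; return its output and whether the input is intact.

    Hypothetical helper: a NaN written into the input also counts as a change,
    because NaN never compares equal to the snapshotted number.
    """
    snapshot = copy.deepcopy(values)
    output = op(values)
    return output, values == snapshot


small_input = [0.5, -1.0, 2.0]
output, unchanged = check_input_unchanged(log_softmax, small_input)
```

With a real framework, the same pattern would presumably wrap the framework's own op and compare tensors rather than lists; comparing only the output, as the commit message notes, would not have revealed the problem.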
10 Jun, 2021 · 2 commits
- fix aligned in roi_align (#33446) · 03f46685
  Committed by wangguanzhong on Jun 10, 2021
- fix the bug in repeated_fc_relu_fuse_pass.test=develop (#33386) (#33431) · c4a417f5
  Committed by 王明冬 on Jun 10, 2021
09 Jun, 2021 · 2 commits
- fix the bug of yolo_box which can't run on nano and tx2 (#33422) (#33442) · d4967224
  Committed by s.feng on Jun 09, 2021
- [Paddle-TRT] Add gather_nd and reduce_sum trt op. (#33324) (#33365) · 6385f5ee
  Committed by Wilber on Jun 09, 2021
08 Jun, 2021 · 3 commits
- Add trt convert reshape_op in release/2.1.1 (#33372) · bad3bebf
  Committed by Wangzheee on Jun 08, 2021
- Cherry pick deconv & jetson single arch (#33387) · 0549d4af
  Committed by Pei Yang on Jun 08, 2021
```
* fix conv2d_transpose trt bugs (#33242)

* fix jetson arch when compiling with single arch (#33269)
```
- OP:strided_slice_op supports bool type inputs (#33373) (#33393) · ccabafa6
  Committed by TeslaZhao on Jun 08, 2021
```
* Fix two english api documents, transpose and strided_slice

* OP:strided_slice_op supports bool type inputs
```
07 Jun, 2021 · 1 commit
- Fix inference prepare data (#33370) · d5225145
  Committed by wenbin on Jun 07, 2021
04 Jun, 2021 · 1 commit
- [CherryPick] fix compare ops when broadcast (#33086) · c42ccf14
  Committed by wawltor on Jun 04, 2021
```
* fix compare op in for in the cuda device

* fix the paddle compare op for the broadcast
```
03 Jun, 2021 · 2 commits
- [ROCM] update paddle inference cmake, test=develop (#33260) (#33290) · b032b579
  Committed by Qi Li on Jun 03, 2021
- [ROCM] fix fused_fc_elementwise_layernorm, test=develop (#33281) (#33299) · ef6120f3
  Committed by Qi Li on Jun 03, 2021
01 Jun, 2021 · 1 commit
- Fix cuda kernel launch of grid sampler (#33100) (#33232) · 8a5a45f8
  Committed by whs on Jun 01, 2021
31 May, 2021 · 1 commit
- disable conv plugin in TRT old versions (#33198) · 7766721a
  Committed by wenbin on May 31, 2021
25 May, 2021 · 1 commit
- [HybridParallel]Fix precision problem of model parallel (#32897) (#33087) · 4026e227
  Committed by ShenLiang on May 25, 2021
```
* fix precision of mp

* fix bug of seed

* fix dp

* print group
```
20 May, 2021 · 1 commit
- BugFix with ParseInputDataType from LodTensorArray (#32918) (#32984) · 8ecaa8a5
  Committed by Aurelius84 on May 20, 2021
```
* BugFix with ParseInputDataType from LodTensorArray

* BugFix with ParseInputDataType from LodTensorArray
```
19 May, 2021 · 2 commits

[Cherry-pick] add enforce check for set_value (#32972) (#32981) · b4b9438a
Committed by Chen Weihang on May 19, 2021
```
cherry-pick of #32972
```

[cherrypick] support cuda11 for heterps; add profiler in oneps (#32957) · ab1a4df9

Committed by danleifeng on May 19, 2021

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

18 May, 2021 · 1 commit
- bugfix: parallel_executor for xpu should use BindThreadedSSAGraphExecutor (#32792) (#32933) · b619648c
  Committed by houj04 on May 18, 2021
11 May, 2021 · 1 commit
- fix find_unused_parameters default value (#32829) · 02513207
  Committed by ShenLiang on May 11, 2021
```
fix error log for reducer

fix doc

fix bug of utest

fix spawn

fix converage
```
07 May, 2021 · 6 commits

[Paddle-TRT] Implement MHA fp16 order same as training (#32629) (#32785) · 09b18a49

Committed by Shang Zhizhou on May 07, 2021

* implement MHA order same as training

* fix fp16 compile issue on old architecture

Co-authored-by: Nzlsh80826 <rewang@nvidia.com>

fix stack grad gpu (#32781) · f54fb1ee
Committed by Jiawei Wang on May 07, 2021

[Cherrypick 2.1] fix compile error on jetson platform (#32760) · ded39f84
Committed by LielinJiang on May 07, 2021
```
* fix compile error on jetson platform

* remove unused head file

* rm decode_jpeg op on jetson platform
```

[CHERRY-PICK2.1]Remove paddle_custom_op dynamic libraries, and link to... · 3ba8c48a

Committed by Zhou Wei on May 07, 2021

[CHERRY-PICK2.1]Remove paddle_custom_op dynamic libraries, and link to FLUID_CORE on windows (#32583) (#32769)

* Remove paddle_custom_op dynamic libraries, change link to FLUID_CORE on windows, and check copy_to

* fix CI

pylayer_op:release context after compute. (#32707) (#32744) · c67a5d98
Committed by WeiXin on May 07, 2021
```
Fixes a memory (GPU memory) leak in py_layer_op caused by PyLayerContext not being destructed.

Original PR: #32707
```

[Cherry-Pick] Clear 'BasicEngine' when an exception occurs in the backward. (#32546) (#32615) · 7e35ef3a

Committed by WeiXin on May 07, 2021

* clear 'BasicEngine' when an exception occurs in the backward. (#32546)

* clear 'BasicEngine' when an exception occurs in the backward.

* deal with conflict.

* deal with conflict.

* forward return any type. (#32661)

06 May, 2021 · 5 commits
- [cherry-pick] Sum kernel for CPU supporting BF16 and SelectedRows (#32631) (#32755) · f3436af1
  Committed by Adam Osewski on May 06, 2021
- [CHERRY-PICK] Reduce grad fix cherrypick (#32742) · 21448525
  Committed by jakpiase on May 06, 2021
```
* base changes for fix

* minor change

* fix for bwd kernel

* removed unnecessary import

* implemented reviewers suggestions

* CI fix
```
- cherry-pick:change softmax_with_cross_entropy_op's parameter name from... · 9a589de8
  Committed by chajchaj on May 06, 2021
```
cherry-pick:change softmax_with_cross_entropy_op's parameter name from softmax_switch to use_softmax (#32750)

* change parameter name from softmax_switch to use_softmax, test=develop

* cherry-pick:change parameter name from softmax_switch to use_softmax, test=develop
```
- update, test=develop (#32731) · df00636b
  Committed by lilong12 on May 06, 2021
- add API Tensor.item() to convert Tensor element to a Python scalar (#32634) · 035c7425
  Committed by Zhou Wei on May 06, 2021
```
cherry-pick #32561
```
05 May, 2021 · 1 commit
- fix traverse graph in reducer (#32721) · 4626afa4
  Committed by ShenLiang on May 05, 2021