Commits · 5fba2a981a1a02e33b77285761e234f2b26c244c · BaiXuePrincess / Paddle

17 Oct, 2022 · 9 commits

[Cherry-pick] Collective communication APIs (#46922) · 5fba2a98

Committed by Wen Sun on Oct 17, 2022

* Support both use_calc_stream and sync_op in send recv APIs (#46023)

* Support both use_calc_stream and sync_op in allgather API (#46295)

* Support both use_calc_stream and sync_op in collective communication API (#46761)

* Move group and all reduce from collective to communication (#45848)

* Completes bfloat16 dtype for collective api in eager mode (#45844)

* Fix collective APIs cannot be recognized when building docs (#46962)
Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>


[cherry-pick]Sparse static graph (#46838) · 10225d22
Committed by zhangkaihuo on Oct 17, 2022
```
cherry-pick : #46322, #46245
Sparse API supports static graph
```

Optimize performance of depthwise_conv (#46896) · 976af0da

Committed by Zhang Zheng on Oct 17, 2022

Optimize performance of depthwise_conv

Config: input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1
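
For reference, the output shape implied by this config follows from the standard convolution size formula. The snippet below is a minimal sketch, not Paddle code: the helper name `conv_out_size` is hypothetical, and it assumes an NCHW input layout and a `[out_channels, in_channels / groups, kH, kW]` filter layout, neither of which the commit message states explicitly.

```python
def conv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis for a standard convolution."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


# Config from the commit message (layouts are assumptions, see above).
input_shape = [2048, 1024, 4, 4]
filter_shape = [1024, 1, 4, 4]

n, c, h, w = input_shape
out_c, in_c_per_group, kh, kw = filter_shape

# One input channel per group means groups == channels, i.e. depthwise.
groups = c // in_c_per_group

out_h = conv_out_size(h, kh, stride=1, pad=0, dilation=1)
out_w = conv_out_size(w, kw, stride=1, pad=0, dilation=1)
output_shape = [n, out_c, out_h, out_w]
print(groups, output_shape)
```

Under these assumptions the 4x4 filter covers the whole 4x4 input, so each channel reduces to a single output value per sample.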


[cherry-pick] Fix the bug of exporting model in dygraph QAT (#47028) · 7eef05c2
Committed by Guanghua Yu on Oct 17, 2022
```
* fix dygraph new format quant
* fix unittest
* fix conflict
```

update to sdk3.0 (#46865) (#46892) · 8c6c79ac
Committed by Allen Guo on Oct 17, 2022


fix ut timeout 2 (#45233) (#46867) · d913bc98
Committed by Allen Guo on Oct 17, 2022


[IPU] paddle-inference support custom-ops (#45235) (#46868) · bd89be12

Committed by Allen Guo on Oct 17, 2022

* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>


rm fp16 dtype_check (#46739) (#46866) · a1cdbad1
Committed by Allen Guo on Oct 17, 2022


[Cherry-Pick]Move valid check from python to kernel (#46980) · 8bfd45ad

Committed by Zhang Zheng on Oct 17, 2022

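To improve performance, the label bounds check is moved from the Python side into the kernel, which reduces extra op calls such as min, max, and synchronous copies.

The current template parameter IgnoreIndex only takes effect when ignore_index lies in [0, dim). However, when a label value is out of bounds and ignore_index equals that label, the computation should still proceed normally. Although the current logic does not produce wrong results, it is still logically flawed, and the template parameter IgnoreIndex is unnecessary.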
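
The rule described above can be sketched in plain Python: a label is acceptable if it equals ignore_index or lies in [0, dim), whether or not ignore_index itself is in range. This is an illustrative sketch only and not the kernel code from the commit; `label_contributes` and `nll_loss` are hypothetical names, and the loss shown is a generic negative log-likelihood chosen for illustration.

```python
import math


def label_contributes(label, dim, ignore_index):
    """Return True if the label should be counted in the loss.

    A label equal to ignore_index is always accepted and skipped,
    even when ignore_index is outside [0, dim). Any other label
    outside [0, dim) is an error.
    """
    if label == ignore_index:
        return False
    if not 0 <= label < dim:
        raise ValueError(f"label {label} is out of range [0, {dim})")
    return True


def nll_loss(log_probs, labels, ignore_index=-100):
    """Mean negative log-likelihood over the non-ignored labels."""
    total, count = 0.0, 0
    for row, label in zip(log_probs, labels):
        if label_contributes(label, len(row), ignore_index):
            total -= row[label]
            count += 1
    return total / count if count else 0.0


log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
# ignore_index=5 is outside [0, 2): the matching label is skipped, not rejected.
loss = nll_loss(log_probs, [0, 5], ignore_index=5)
print(loss)
```

Checking the ignore case before the range check is what makes a separate in-range condition on ignore_index unnecessary in this sketch.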


14 Oct, 2022 · 8 commits
- Fix nvcc lazy (#46879) · 5c2bea17
  Committed by xiaoxiaohehe001 on Oct 14, 2022
- cherry-pick 46942 (#47015) · 82db4993
  Committed by Wilber on Oct 14, 2022
- update quantization new format (#46529) · 84333cf5
  Committed by Guanghua Yu on Oct 14, 2022
- Add bmm convert (#47011) · 8f1ac7cf
  Committed by xiaoxiaohehe001 on Oct 14, 2022
- [Dy2St]Remove usless cast operation to speed up FP16 training (#46851) (#46998) · 27444326
  Committed by Aurelius84 on Oct 14, 2022
- [BUG]Fix expand_as_v2 bug while X and Y with different dtype (#46950) (#46999) · 4b472656
  Committed by Aurelius84 on Oct 14, 2022
```
* [BUG]Fix expand_as_v2 bug while X and Y with different dtype

* fix commit
```
- [cherry-pick 2.4][inference] fix reshape2 opteller (#46871) · 535d7574
  Committed by Zhang Jun on Oct 14, 2022
```
* fix reshape2 opteller;
add elementwise min/max register for tensorrt
```
- [Paddle-TRT] support new quant format from slim (#46022) (#46979) · b8677c0d
  Committed by zhoutianzi666 on Oct 14, 2022
13 Oct, 2022 · 3 commits

interpretercore thread not always spin (#46687) (#46952) · d90aaa6e
Committed by zhangbo9674 on Oct 13, 2022
[Cherry-pick] Add fp16 dtype support for set_value op (#46906) · 100a0750
Committed by 傅剑寒 on Oct 13, 2022
```
Fix set_value failure when source tensor is fp16 dtype and destination value is a number
(dev PR link: #46801)
```

[cherry-pick] [PHI] transpose2_grad op migration (#46139) (#46873) · 0280c0b9

Committed by Sławomir Siwek on Oct 13, 2022

* Revert pool+grad oneDNN kernel conversion (#45989)

* [PHI] transpose2_grad op migration (#46139)

* op migrated, Copy(OneDNNContext, ...) added

* mutable_data & op registration in fluid removed

* refactoring

* OneDNNGetDataType to uppercase

* missing cpu check added, handler moved to .h file

* name changed to transpose_grad

* Copy changed back to TensorCopy

* Resizing corrected, Copy(OneDNNContext) removed
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
Co-authored-by: Paulina Gacek <paulina.gacek@intel.com>


12 Oct, 2022 · 2 commits
- [Cherry-pick]Update layout autotune for module with no modified (#46541) (#46515) (#46880) · 61273c0e
  Committed by niuliling123 on Oct 12, 2022
```
Cherry-pick 46541
Ensure layout autotuning works with zero modifications for the ResNet50, TSM, and deeplabv3 models
```
- cherry pick pr46536 (#46901) · 08d233f9
  Committed by ronnywang on Oct 12, 2022
```
cherry pick pr46536
```
11 Oct, 2022 · 9 commits
- set_value_op: add support for complex types (#46885) · b051455f
  Committed by Feiyu Chan on Oct 11, 2022
- add seed check (#46858) · 2190da20
  Committed by Sławomir Siwek on Oct 11, 2022
- hard_swish grad (#46857) · 2c6bd4ad
  Committed by Sławomir Siwek on Oct 11, 2022
- [cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0
  Committed by Sławomir Siwek on Oct 11, 2022
```
* [PHI] Migrate gelu kernels (#45596)

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* gelu fwd

* sort activations

* gelu gradient

* remove unused macros

* merge conflicts

* fix merge conflicts

* remove extra constraint from gelu op

* [PHI] relu6_grad kernel (#46501)

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style
```
- Revert pool+grad oneDNN kernel conversion (#45989) (#46860) · 7b3837e6
  Committed by Sławomir Siwek on Oct 11, 2022
```
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
```
- speedup ChannelClipAndQuantDequantKernelQuantAxis1 kernel (#46471) (#46551) · f5565494
  Committed by ceci3 on Oct 11, 2022
- Cherry pick for dygraph pp (#46876) · 9cc3f69f
  Committed by Yuang Liu on Oct 11, 2022
```
* bug fix for virtual pipeline parallel (#45922)

* dont wait for send op under dygraph pp (#46209)

* [interleave pp] sync recv for 1f1b (#46399)

* [dygraph pp] all sync for allgather partial (#46483)
```
- [BugFix]Fix concat bugs when call onednn kernel (#46518) (#46845) · 6a6c7493
  Committed by YuanRisheng on Oct 11, 2022
```
* fix concat bug

* fix ci bugs

* fix ci bugs
```
- optimize Paddle-TRT performance (#46684) · d091d1b0
  Committed by Yuanle Liu on Oct 11, 2022
10 Oct, 2022 · 7 commits

Fix gather op convert for Paddle-TensorRT (#46779) (#46825) · a0e03418
Committed by feng_shuai on Oct 10, 2022
```
* fix gather op convert to only support int32 index as input.
* add ut
```

[Dy2St]Fix Regex DeprecationWarning in PY3 (#46829) · d8daf64e
Committed by Aurelius84 on Oct 10, 2022


[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN... · fdd0d6d0

Committed by Sławomir Siwek on Oct 10, 2022

[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN kernels (#45863) (#46727)

* [PHI] Migrate concat+grad, expand+grad, fill_constant, nearest_interp and bilinear_interp oneDNN kernels (#45863)

* Migrate concat+grad, expand+grad, fill_constant, nearest_interp_v2 and bilinear_interp_v2 oneDNN kernels to PHI

* Remove old namespace variable

* Fix invalid out dims error

* Add mutable_data method to concat output

* Add check for -1 dim before computing out_dims

* Capitalize oneDNNGetDataType function name

* Change fill_constant kernel to correct PHI kernel

* Attempt to fix dims error

* Fix fill_constant (full) kernel

* update dependencies
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>


[cherry-pick] [PHI] Migrate sgd and stack oneDNN kernels (#46374) (#46729) · 25d61cd1

Committed by Sławomir Siwek on Oct 10, 2022

* [PHI] Migrate sgd and stack oneDNN kernels (#46374)

* Convert slice+grad oneDNN fluid kernels to PHI

* Change mutable_data to Alloc

* Refactor licences

* update dependencies
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>


[PHI] Migrate slice, slice_grad, split, pad and pad3d oneDNN kernels (#46101) (#46726) · 51a91fee

Committed by Sławomir Siwek on Oct 10, 2022

* Convert split, pad and pad3d kernels

* Convert slice+grad oneDNN fluid kernels to PHI

* change out->mutable_data to dev_ctx.Alloc
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>


[PHI] migrate softmax_grad kernel (#46257) (#46725) · 44ecae6c
Committed by Sławomir Siwek on Oct 10, 2022
```
* init

* remove softmaxop

* merge dev

* correct dir

* style
```

[PHI] Shape op migration (#46051) (#46724) · 3cc3f60f

Committed by Sławomir Siwek on Oct 10, 2022

* First approach

* Shape kernel corrected

* Compilation error fixed

* Resize corrected

* Registered types added

* Mistake corrected & types added

* sum kernel deleted
Co-authored-by: Paulina Gacek <paulina.gacek.pl@gmail.com>


09 Oct, 2022 · 1 commit

[Dy2Static] refactor the return transformer (#45900) (#46205) · 4282af69

Committed by xiongkun on Oct 09, 2022

* 1. refactor the return transformer.
2. fix some bugs in return transformer.

* support raise error while return stmt's father is For or while

* fix ci error.

* fix ci error and add some unittest

* code format

* fix ci error


29 Sep, 2022 · 1 commit
- [cherry-pick] Add FP16 support for uniform in dygraph mode on Nvidia GPU (#46641) · a58663f3
  Committed by 傅剑寒 on Sep 29, 2022
```
Add FP16 support for uniform in dygraph mode on Nvidia GPU
Dev PR link PR46212
```
