Commits · 5fef043dc0950c2884ad538303f8f56ee3b1c86f · BaiXuePrincess / Paddle

18 Oct, 2022 · 4 commits
- [cherry-pick 2.4] add sparse api transpose/reshape/is_same_shape (#47076) · 5fef043d
  Committed by zhouweiwei2014 on Oct 18, 2022
```
Add three APIs: sparse.is_same_shape, sparse.reshape, and sparse.transpose
```
  5fef043d
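
To make the three operations named in this commit concrete, here is a minimal pure-Python sketch on a COO (coordinate) sparse layout. It is an illustration only: the dict-based container and the helper bodies are hypothetical and are not Paddle's implementation or signatures; only the operation names come from the commit message.

```python
from math import prod

# Hypothetical COO container (not Paddle's data structure):
#   {"indices": [coordinate tuples], "values": [numbers], "shape": tuple}

def is_same_shape(a, b):
    """Return True when both sparse tensors have the same dense shape."""
    return tuple(a["shape"]) == tuple(b["shape"])

def transpose(x, perm):
    """Permute the axes by reordering each coordinate and the shape."""
    shape = tuple(x["shape"][p] for p in perm)
    indices = [tuple(idx[p] for p in perm) for idx in x["indices"]]
    return {"indices": indices, "values": list(x["values"]), "shape": shape}

def reshape(x, new_shape):
    """Map each coordinate to a row-major offset, then into the new shape."""
    if prod(new_shape) != prod(x["shape"]):
        raise ValueError("reshape must preserve the number of elements")
    indices = []
    for idx in x["indices"]:
        flat = 0
        for i, dim in zip(idx, x["shape"]):
            flat = flat * dim + i
        coords = []
        for dim in reversed(new_shape):
            coords.append(flat % dim)
            flat //= dim
        indices.append(tuple(reversed(coords)))
    return {"indices": indices, "values": list(x["values"]),
            "shape": tuple(new_shape)}

coo = {"indices": [(0, 2), (1, 0)], "values": [3.0, 4.0], "shape": (2, 3)}
t = transpose(coo, [1, 0])
r = reshape(coo, (3, 2))
```

Only the stored coordinates are touched; the values are carried over unchanged, which is why these operations can stay cheap on sparse data.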
- support shape tensor is the input of trt-subgraph (#47066) · 5a44c124
  Committed by zhoutianzi666 on Oct 18, 2022
  
  5a44c124
- [cherry-pick] Fix perf issues of mp/pp/fuse in eager mode (#47071) · b84edd90
  Committed by Haohongxiang on Oct 18, 2022
```
* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)

* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)

* update
```
  b84edd90
- [Cherry pick] trt pool2d adaptive fix (#47069) · 5f6b9f1b
  Committed by Wang Bojun on Oct 18, 2022
```
* draft with debug print
* remove debug print
* bug fix for ci
```
  5f6b9f1b
17 Oct, 2022 · 5 commits

[Cherry-pick] Collective communication APIs (#46922) · 5fba2a98

Committed by Wen Sun on Oct 17, 2022

* Support both use_calc_stream and sync_op in send recv APIs (#46023)

* Support both use_calc_stream and sync_op in allgather API (#46295)

* Support both use_calc_stream and sync_op in collective communication API (#46761)

* Move group and all reduce from collective to communication (#45848)

* Completes bfloat16 dtype for collective api in eager mode (#45844)

* Fix collective APIs cannot be recognized when building docs (#46962)
Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>

5fba2a98

[cherry-pick]Sparse static graph (#46838) · 10225d22
Committed by zhangkaihuo on Oct 17, 2022
```
cherry-pick : #46322, #46245
Sparse API now supports static graph mode
```
10225d22

Optimize performance of depthwise_conv (#46896) · 976af0da

Committed by Zhang Zheng on Oct 17, 2022

Optimize performance of depthwise_conv

Config: input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1
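
For reference, the output shape implied by this config can be worked out with the standard convolution size formula. The sketch below assumes NCHW layout and that the filter's first dimension is the output channel count; the helper name is hypothetical and the code does not call Paddle.

```python
def conv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (in_size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1

# Values taken from the config line above (NCHW layout assumed).
batch, channels, in_h, in_w = 2048, 1024, 4, 4
out_channels, per_group_in, k_h, k_w = 1024, 1, 4, 4

# Depthwise convolution: one filter per input channel (groups == channels).
assert per_group_in == 1 and out_channels == channels

out_shape = (
    batch,
    out_channels,
    conv_out_size(in_h, k_h, stride=1, pad=0, dilation=1),
    conv_out_size(in_w, k_w, stride=1, pad=0, dilation=1),
)
```

With a 4x4 filter over a 4x4 input and no padding, each channel collapses to a single output value.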

976af0da

[IPU] paddle-inference support custom-ops (#45235) (#46868) · bd89be12

Committed by Allen Guo on Oct 17, 2022

* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

bd89be12

[Cherry-Pick]Move valid check from python to kernel (#46980) · 8bfd45ad

Committed by Zhang Zheng on Oct 17, 2022

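To improve performance, move the label bounds check from the Python side into the kernel, reducing extra op calls such as min, max, and synchronous copies.
    The current template parameter IgnoreIndex only takes effect when ignore_index lies in [0, dim). However, when a label value is out of bounds and ignore_index equals that label, the computation should still proceed normally. Although the current logic does not produce wrong results, it is still logically flawed, and the template parameter IgnoreIndex is unnecessary.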
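
The behaviour described in this commit message can be sketched in plain Python: the bounds check runs inside the per-sample loop (standing in for the kernel), and a label equal to ignore_index is skipped before the range check, so an out-of-range ignore_index still works. The function name, the default of -100, and the error type are assumptions for illustration, not Paddle's code.

```python
import math

def cross_entropy_with_inline_check(logits, labels, ignore_index=-100):
    """Per-sample softmax cross entropy with the label check done inline."""
    losses = []
    for row, label in zip(logits, labels):
        dim = len(row)
        # Ignored labels are skipped first, even when ignore_index is
        # outside [0, dim).
        if label == ignore_index:
            losses.append(0.0)
            continue
        # Bounds check happens here instead of in separate min/max ops.
        if not 0 <= label < dim:
            raise ValueError(f"label {label} is outside [0, {dim})")
        # Numerically stable log-sum-exp.
        peak = max(row)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
        losses.append(log_sum - row[label])
    return losses

losses = cross_entropy_with_inline_check(
    [[0.0, 0.0], [1.0, 2.0]], [0, -100]
)
```

Checking inside the loop avoids a separate pass over the labels, which mirrors the motivation stated in the message.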

8bfd45ad

14 Oct, 2022 · 5 commits
- cherry-pick 46942 (#47015) · 82db4993
  Committed by Wilber on Oct 14, 2022
  
  82db4993
- Add bmm convert (#47011) · 8f1ac7cf
  Committed by xiaoxiaohehe001 on Oct 14, 2022
  
  8f1ac7cf
- [BUG]Fix expand_as_v2 bug while X and Y with different dtype (#46950) (#46999) · 4b472656
  Committed by Aurelius84 on Oct 14, 2022
```
* [BUG]Fix expand_as_v2 bug while X and Y with different dtype

* fix commit
```
  4b472656
- [cherry-pick 2.4][inference] fix reshape2 opteller (#46871) · 535d7574
  Committed by Zhang Jun on Oct 14, 2022
```
* fix reshape2 opteller;
add elementwise min/max register for tensorrt
```
  535d7574
- [Paddle-TRT] support new quant format from slim (#46022) (#46979) · b8677c0d
  Committed by zhoutianzi666 on Oct 14, 2022
  
  b8677c0d
13 Oct, 2022 · 3 commits

interpretercore thread not always spin (#46687) (#46952) · d90aaa6e
Committed by zhangbo9674 on Oct 13, 2022

d90aaa6e
[Cherry-pick] Add fp16 dtype support for set_value op (#46906) · 100a0750
Committed by 傅剑寒 on Oct 13, 2022
```
Fix set_value failure when the source tensor has fp16 dtype and the destination value is a number
(dev PR link:#46801)
```
100a0750

[cherry-pick] [PHI] transpose2_grad op migration (#46139) (#46873) · 0280c0b9

Committed by Sławomir Siwek on Oct 13, 2022

* Revert pool+grad oneDNN kernel conversion (#45989)

* [PHI] transpose2_grad op migration (#46139)

* op migrated, Copy(OneDNNContext, ...) added

* mutable_data & op registration in fluid removed

* refactoring

* OneDNNGetDataType to uppercase

* missing cpu check added, handler moved to .h file

* name changed to transpose_grad

* Copy changed back to TensorCopy

* Resizing corrected, Copy(OneDNNContext) removed
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
Co-authored-by: Paulina Gacek <paulina.gacek@intel.com>

0280c0b9

12 Oct, 2022 · 1 commit
- [Cherry-pick]Update layout autotune for module with no modified (#46541) (#46515) (#46880) · 61273c0e
  Committed by niuliling123 on Oct 12, 2022
```
Cherry-pick 46541
Ensure layout autotuning works with zero modifications for the ResNet50, TSM, and deeplabv3 models
```
  61273c0e
11 Oct, 2022 · 8 commits
- set_value_op: add support for complex types (#46885) · b051455f
  Committed by Feiyu Chan on Oct 11, 2022
  
  b051455f
- add seed check (#46858) · 2190da20
  Committed by Sławomir Siwek on Oct 11, 2022
  
  2190da20
- hard_swish grad (#46857) · 2c6bd4ad
  Committed by Sławomir Siwek on Oct 11, 2022
  
  2c6bd4ad
- [cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0
  Committed by Sławomir Siwek on Oct 11, 2022
```
* [PHI] Migrate gelu kernels (#45596)

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* gelu fwd

* sort activations

* gelu gradient

* remove unused macros

* merge conflicts

* fix merge conflicts

* remove extra constraint from gelu op

* [PHI] relu6_grad kernel (#46501)

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style
```
  2bcbf8b0
- Revert pool+grad oneDNN kernel conversion (#45989) (#46860) · 7b3837e6
  Committed by Sławomir Siwek on Oct 11, 2022
```
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
```
  7b3837e6
- speedup ChannelClipAndQuantDequantKernelQuantAxis1 kernel (#46471) (#46551) · f5565494
  Committed by ceci3 on Oct 11, 2022
  
  f5565494
- [BugFix]Fix concat bugs when call onednn kernel (#46518) (#46845) · 6a6c7493
  Committed by YuanRisheng on Oct 11, 2022
```
* fix concat bug

* fix ci bugs

* fix ci bugs
```
  6a6c7493
- optimize Paddle-TRT performance (#46684) · d091d1b0
  Committed by Yuanle Liu on Oct 11, 2022
  
  d091d1b0
10 Oct, 2022 · 6 commits

Fix gather op convert for Paddle-TensorRT (#46779) (#46825) · a0e03418
Committed by feng_shuai on Oct 10, 2022
```
* fix gather op convert to only support int32 index as input.
* add ut
```
a0e03418

[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN... · fdd0d6d0

Committed by Sławomir Siwek on Oct 10, 2022

[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN kernels (#45863) (#46727)

* [PHI] Migrate concat+grad, expand+grad, fill_constant, nearest_interp and bilinear_interp oneDNN kernels (#45863)

* Migrate concat+grad, expand+grad, fill_constant, nearest_interp_v2 and bilinear_interp_v2 oneDNN kernels to PHI

* Remove old namespace variable

* Fix invalid out dims error

* Add mutable_data method to concat output

* Add check for -1 dim before computing out_dims

* Capitalize oneDNNGetDataType function name

* Change fill_constant kernel to correct PHI kernel

* Attempt to fix dims error

* Fix fill_constant (full) kernel

* update dependencies
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>

fdd0d6d0

[cherry-pick] [PHI] Migrate sgd and stack oneDNN kernels (#46374) (#46729) · 25d61cd1

Committed by Sławomir Siwek on Oct 10, 2022

* [PHI] Migrate sgd and stack oneDNN kernels (#46374)

* Convert slice+grad oneDNN fluid kernels to PHI

* Change mutable_data to Alloc

* Refactor licences

* update dependencies
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>

25d61cd1

[PHI] Migrate slice, slice_grad, split, pad and pad3d oneDNN kernels (#46101) (#46726) · 51a91fee

Committed by Sławomir Siwek on Oct 10, 2022

* Convert split, pad and pad3d kernels

* Convert slice+grad oneDNN fluid kernels to PHI

* change out->mutable_data to dev_ctx.Alloc
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>

51a91fee

[PHI] migrate softmax_grad kernel (#46257) (#46725) · 44ecae6c
Committed by Sławomir Siwek on Oct 10, 2022
```
* init

* remove softmaxop

* merge dev

* correct dir

* style
```
44ecae6c

[PHI] Shape op migration (#46051) (#46724) · 3cc3f60f

Committed by Sławomir Siwek on Oct 10, 2022

* First approach

* Shape kernel corrected

* Compilation error fixed

* Resize corrected

* Registered types added

* Mistake corrected & types added

* sum kernel deleted
Co-authored-by: Paulina Gacek <paulina.gacek.pl@gmail.com>

3cc3f60f

29 Sep, 2022 · 4 commits

[cherry-pick] Add FP16 support for uniform in dygraph mode on Nvidia GPU (#46641) · a58663f3
Committed by 傅剑寒 on Sep 29, 2022
```
Add FP16 support for uniform in dygraph mode on Nvidia GPU
Dev PR link PR46212
```
a58663f3

[cherry-pick] Open the clip_extra flag in save_inference_model (#46577) · d67da3dc

Committed by zyfncg on Sep 29, 2022

* set flag of clip_extra in save_inference_model to true (#46151)

* open the clip_extra flag in paddle.static.save_inference_model, test=allcase (#46456)

* Open the clip_extra flag in TracedLayer.save_inference_model (#46473)

* open the clip_extra flag in paddle.static.save_inference_model, test=allcase

* set the default value of clip_extra in TracedLayer from False to True, test=allcase

* update english doc of paddle.static.save_inference_model, test=document_fix (#46484)

* Fix clip_extra logic in remove_training_info (#46534)

* fix clip_extra code in remove_training_info

* revert rnn opmaker clear

d67da3dc

Fix the half precision problem of general plugin (#46580) · d90db9bd
Committed by weishengying on Sep 29, 2022

d90db9bd
[CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and... · f5956bec
Committed by Lin Manhui on Sep 29, 2022
```
[CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and InverseFloorDivideFunctor (#45051) (#46504)
```
f5956bec
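
The log does not say why std::trunc() was removed. One possible reason, offered here purely as an assumption, is that passing an integer quotient through a floating-point function forces a round trip through a 64-bit double, which cannot represent every 64-bit integer. The sketch below imitates that round trip in Python, where float is an IEEE 754 double; the helper names are hypothetical and the removed C++ expression itself is not shown in the log.

```python
import math

def c_style_int_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C++ does for ints."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

def div_via_trunc(a: int, b: int) -> int:
    """Assumed pattern: integer quotient passed through trunc() on a double."""
    return int(math.trunc(float(c_style_int_div(a, b))))

# 2**53 + 1 is the smallest positive integer a double cannot represent.
big = 2**53 + 1
direct = c_style_int_div(big, 1)
round_tripped = div_via_trunc(big, 1)
```

For small operands the two paths agree, so a difference would only show up with large 64-bit values.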

28 Sep, 2022 · 4 commits

refine dy2st glog (#46415) (#46438) · 3f35e634
Committed by zhangbo9674 on Sep 28, 2022

3f35e634

Fix libpaddle soname mismatch error (#46344) (#46576) · 1c22ed7f

Committed by Chen Weihang on Sep 28, 2022

* fix libpaddle soname mismatch error

* fix windows failed

* polish linux and windows make impl

* unify windows lib name

* fix windows error

* revert copy dst change

* revert naming change

* revert windows change

* fix gpups compile failed

1c22ed7f

[cherry-pick] Clear extra attrs of some ops in OpMaker (#46150, #46321,... · b2e4211d

Committed by zyfncg on Sep 28, 2022

[cherry-pick] Clear extra attrs of some ops in OpMaker (#46150, #46321, #46418, #46451, #46457) (#46553)

* Clear extra attributes of some Op in OpMaker (Part4) (#46060)

* clear extra attr of some ops in opmaker

* revert clear use_cudnn for pool

* fix test_operator_desc

* fix Attr interface of OperatorBase

* clear extra attrs of condition op in opmaker (#46150)

* Clear extra attrs of lookup_table_v2 in OpMaker (#46321)

* clear extra attrs of look_up_table_v2 in opmaker

* fix bug

* clear extra attrs of quantize op in opmaker (#46418)

* delete repeated item

* clear extra attrs of distribute op in opmaker (#46451)

* clear extra attrs of sequence_softmax in opmaker (#46457)

b2e4211d

remove trt_reshape2_matmul_fuse_pass (#46363) · a77a6f6b
Committed by zhoutianzi666 on Sep 28, 2022

a77a6f6b

BaiXuePrincess / Paddle · In sync with the fork's source project
