Commits · 11c2874e6674296d8db59b02651bd711ee03f2c4 · Crayon鑫 / Paddle

28 Oct, 2021 8 commits
- [fix-doc-bug] Fix fused_attention_op english doc test=document_fix (#36803) · 11c2874e
  Authored by Li Min on Oct 28, 2021
```
* Fix fused_attention english doc test=document_fix
```
- ctc grad compute on gpu (#36756) · 54ef9d06
  Authored by Hui Zhang on Oct 28, 2021
```
* Revert "Align CTC grad scale same with ESPNet (#34729)"

This reverts commit 10f9644c.

* ctc grad compute on gpu
```
- save/load in ps runtime(the_one_ps) (#36097) · e7842ba6
  Authored by wangguanqun on Oct 28, 2021
```
* add trainer desc config to distributed strategy

* code style modified

* data_feed set lod

* fix bug

* code style

* fix bug

* save load

* save load

* save unittest

* add unittest of the_one_ps

* unittest

* add todo in communicator sendsparse
```
- fix MultiSlotDataGenerator error (#36773) · dc0178ef
  Authored by seemingwang on Oct 28, 2021
- Add lazy distributed launch with rank mapping (#36570) · 7de3f81c
  Authored by Bo Liu on Oct 28, 2021
- Fix fused_attention_op and fused_feedforward_op bug when pre_layer_norm is false. (#36793) · ff3018d7
  Authored by Li Min on Oct 28, 2021
```
* Fix bug when pre_layer_norm is false.
```
- fix device docs;test=document_fix (#36784) · d4b0d03b
  Authored by Ligoml on Oct 28, 2021
```
* fix device docs;test=document_fix

* update __init__.py
```
- first commit (#36778) · 6edbdbfa
  Authored by limingshu on Oct 28, 2021

27 Oct, 2021 18 commits

add unittest (#36511) · 51a33962
Authored by pangyoki on Oct 27, 2021

[ROCM] add custom op support, test=develop (#36771) · dd1d3789
Authored by Qi Li on Oct 27, 2021
```
* [ROCM] add custom op support, test=develop

* remove debug codes, test=develop
```
GeneratePass support attr condition and mapping (#36747) · 5c569aef
Authored by wuhuanzhou on Oct 27, 2021
```
* GeneratePass support attr condition and mapping, test=develop

* fix coverage, test=develop
```

add paddle.version.cuda and paddle.version.cudnn API (#36556) · d65f41db
Authored by pangyoki on Oct 27, 2021

* add paddle.version.cuda and paddle.version.cudnn API

* fix little bug

* fix bug

* add doc string

* fix mkdir error

* fix windows path

* fix new paddle/version path

* fix unittest

* fix format

fix dygraph adamw (#36745) · b42a7370
Authored by zhaoyingli on Oct 27, 2021

add dcnv2 trt plugin (#36612) · 8c3decd8
Authored by wangxinxin08 on Oct 27, 2021
```
* add dcnv2 plugin
```
[Auto Parallel] Completion Dist Attribute for Backward & Update stage (#36744) · 5e9845b8
Authored by JZ-LIANG on Oct 27, 2021
```
* revise completion for backward

* revise completion for update

* revise completion for update

* update unittest
```

Added fp32 / bf16 forward and backward elementwise_div_mkldnn operator (#36158) · e92e6b06
Authored by piotrekobiIntel on Oct 27, 2021

* Add WIP version of elementwise_div_mkldnn without working dy grad

* Add dy gradient calculation implementation, disable broadcast tests

* Readd removed tests from static_mode_white_list

* Add bfloat16 gradient tests, remove int8 and uint8 support

* - Change the way dy grad is calculated to improve performance
- Refactor BinaryMKLDNNHandler to use a default parameter

* Change copyright year

* Refactor as suggested

* Attempt to bypass CI Approval not accepting max_relative_error

* Fix formatting issue

delete extra clear_model (#36656) · 9a1cc609
Authored by xiaoxiao-luomu on Oct 27, 2021

* gloo hdfs set check & gloo connect retry

* add vlog

* print gloo connect addr & add vlog

* .

* modify vlog

* modify vlog

* modify vlog

* Update __init__.py

deleted extra clear_model

[PaddlePaddle Hackathon] add DenseNet (#36069) · c09fe142
Authored by fuqianya on Oct 27, 2021
```
* add DenseNet
```
[BUGFIX] Add return self for nn.Layer(#36609) · facf6020
Authored by Hui Zhang on Oct 27, 2021
```
* Layer.to return self

* add device required
```
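
As a rough illustration of why returning `self` from a `to`-style method matters, the sketch below uses a hypothetical `TinyLayer` class (not Paddle's `nn.Layer`; the class, its attributes, and `set_name` are assumptions made up for this example). When the method returns the same object, calls can be chained and the result can be assigned directly.

```python
class TinyLayer:
    """Hypothetical stand-in for a layer; this is NOT Paddle's nn.Layer."""

    def __init__(self):
        self.device = "cpu"
        self.name = "layer"

    def to(self, device):
        # Update the state in place, then return the same object
        # so the call can be chained or assigned.
        self.device = device
        return self

    def set_name(self, name):
        # Hypothetical second method, only here to demonstrate chaining.
        self.name = name
        return self


layer = TinyLayer()
moved = layer.to("gpu:0").set_name("encoder")
print(moved is layer, layer.device, layer.name)
```

Without the `return self`, `layer.to("gpu:0")` would evaluate to `None` and the chained call would fail.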

Fused transformer encoder layer and fused feedforward layer (#36604) · 9f3613f3
Authored by zhangkaihuo on Oct 27, 2021

This PR contains the layer-level code for fused_transformer, including the layer code for FusedFeedForward and FusedTransformerEncoderLayer.

bugfix: only check backend when mode == Collective (#36758) · e6253152
Authored by xiongkun on Oct 27, 2021
```
* bugfix: only check backend when mode == Collective

* fix bug
```

fix fftshift/ifftshift on static mode (#36748) · 34b6860e
Authored by Feiyu Chan on Oct 27, 2021

* fix fftshift/ifftshift on static mode
* update roll_op version
* add more test cases for fftshift/ifftshift
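
For readers unfamiliar with the relationship between these shifts and a roll operation, here is a minimal 1-D sketch in plain Python. It assumes the usual NumPy-style definitions (`fftshift` is a circular roll by `n // 2`, `ifftshift` by `-(n // 2)`); the helper names are made up for illustration and this is not Paddle's implementation.

```python
def roll(seq, shift):
    """Circularly shift a list to the right by `shift` positions."""
    n = len(seq)
    if n == 0:
        return []
    shift %= n  # normalize negative or oversized shifts
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def fftshift(seq):
    # Move the zero-frequency term to the center of the sequence.
    return roll(seq, len(seq) // 2)


def ifftshift(seq):
    # Inverse of fftshift; the two differ only for odd lengths.
    return roll(seq, -(len(seq) // 2))


freqs = [0, 1, 2, -2, -1]        # FFT output order for n = 5
centered = fftshift(freqs)
restored = ifftshift(centered)
print(centered, restored)
```

The odd-length case is the one worth testing, since there applying `fftshift` twice does not return the original order while `ifftshift` does.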

add fp16 unittests for kl2 (#36583) · 6838a187
Authored by taixiurong on Oct 27, 2021

enable trt test check and fix trt ut error (3/3) (#36581) · 8c1c72af
Authored by Wilber on Oct 27, 2021

add paddle.linalg.eigvalsh API (#35615) · 9f9ed3ae
Authored by huangjun12 on Oct 27, 2021

* add eigvalsh with is_test

* add eigvalsh op

* fix backward bug

* forward and backward, float and complex, unittest

* remove eigvalsh_helper.h

* remove changes of cusolver.h

* fix unittest

* fix unittest bug

* update code following eigh

* fix test

* update lapack

* pull develop

* update functor

* fix unittest bug

* fix details

* add tensor_method_func

* fix notes

show paddle traceback after last user code traceback (#36741) · 63f3ae07
Authored by 0x45f on Oct 27, 2021


26 Oct, 2021 13 commits

Remove additional warning in layer.to (#36700) · 63f1e6bd
Authored by Jiabin Yang on Oct 26, 2021

* remove additional warning in layer.to

* remove additional warning in layer.to

* remove additional warning in layer.to

* remove additional warning in layer.to

* remove additional warning in layer.to

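Add fused attention op backward and python layer. (#36498) · 5119428e
Authored by Li Min on Oct 26, 2021

Purpose: this PR aims to improve the computational performance of the attention module.
To reduce the framework's op-scheduling overhead, this PR implements the attention module by hand at the C++ level and exposes it as a single large attention op.
To reduce memory-access overhead, this PR applies two optimizations:
(1) When computing q, k, and v, the shared input X is used so that the gemm, transpose, and bias add at that point drop from three calls to one.
(2) Kernel fusion is used so that data is passed through registers between different CUDA kernels.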
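
The first optimization can be illustrated with a minimal pure-Python sketch. It is not the PR's C++/CUDA code: the helper names (`matmul`, `add_bias`, `qkv_separate`, `qkv_fused`) and the tiny matrices are assumptions made up for this example, and the transpose step is omitted. The idea shown is that because q, k, and v share the same input X, their three weight matrices can be concatenated column-wise so that a single matrix multiply and a single bias add produce all three results.

```python
def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of m."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def qkv_separate(x, wq, wk, wv, bq, bk, bv):
    # Baseline: three matrix multiplies and three bias adds on the same input.
    return (add_bias(matmul(x, wq), bq),
            add_bias(matmul(x, wk), bk),
            add_bias(matmul(x, wv), bv))


def qkv_fused(x, wq, wk, wv, bq, bk, bv):
    # Concatenate the weights column-wise into one [d_in, 3 * d_out] matrix
    # and the biases into one vector, then do a single multiply and bias add.
    w = [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]
    b = bq + bk + bv
    out = add_bias(matmul(x, w), b)
    d = len(bq)
    # Split the fused output back into q, k, v.
    return ([row[:d] for row in out],
            [row[d:2 * d] for row in out],
            [row[2 * d:] for row in out])


x = [[1.0, 2.0], [3.0, 4.0]]
wq = [[1.0, 0.0], [0.0, 1.0]]
wk = [[2.0, 1.0], [0.0, 1.0]]
wv = [[0.5, 0.5], [1.0, -1.0]]
bq, bk, bv = [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]

fused = qkv_fused(x, wq, wk, wv, bq, bk, bv)
separate = qkv_separate(x, wq, wk, wv, bq, bk, bv)
print(fused == separate)
```

In a real kernel the benefit comes from reading X once and launching one GEMM instead of three; this sketch only demonstrates that the two formulations compute the same values.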

roll_op: support Tensor as input for shifts (#36727) · 7b1e30fc
Authored by Feiyu Chan on Oct 26, 2021

[new-exec] cache exception in child thread (#36692) · 87fbbd36
Authored by Leo Chen on Oct 26, 2021
```
* cache exception in child thread

* add ut

* fix ut
```
Modify paddle.static.nn.cond doc (#36694) · eb9ef885
Authored by Huihuang Zheng on Oct 26, 2021
```
Update `cond` English document
```
Move fused_attention and fused_feedforward functional api path to incubate (#36704) · 9aeca2f1
Authored by Li Min on Oct 26, 2021
```
Move the Python API interfaces added in PRs #35905 and #35843 to the incubate directory.
```

[NPU] fix argsort op, test=develop (#36576) · 3523bbe8
Authored by Qi Li on Oct 26, 2021

* [NPU] fix argsort op, test=develop

* remove debug files, test=develop

* fix typo, test=develop

* address review comments, test=develop

fix wrong trt dim when input dim is 2 (#36614) · 43dcf235
Authored by baoachun on Oct 26, 2021

* fix wrong trt dim when input dim is 2

* update leaky_relu and instance_norm converter unit test

* add instance_norm input dim check

move fft and signal files, move signal APIs (#36540) · 81e0c1ba
Authored by Feiyu Chan on Oct 26, 2021

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul, mul) convert pass, fix (matmul, mul) op_teller (#36652) · 93c591e2
Authored by Wangzheee on Oct 26, 2021

* new_Matmul2ToMatmulToMul

* new_Matmul2ToMatmulToMul

* fix paddle_pass_builder

* fix paddle_pass_builder

* fix paddle_pass_builder

* tem

* tem

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* add matmul_broadcast_unitest

* fix op_teller

Support various length support for SelectedRows in GLOO::AllGather (#36637) · eca78a9f
Authored by xiongkun on Oct 26, 2021

* In cpu parallel using gloo, add various length support for SelectedRows

* fix bug

* fix bugs

* fix by code review

* remove timeout

Fix conv2d convert case (#36699) · db633aff
Authored by JingZhuangzhuang on Oct 25, 2021
```
* fix pool2d convert case

* add pool2d convert test case for trt6
```

Pool3d 2.0 (#36545) · 229bae81
Authored by feng_shuai on Oct 26, 2021

25 Oct, 2021 1 commit
- [NPU] modifications for model ernie-1.0 (#36642) · 19b02d95
  Authored by Aganlengzi on Oct 25, 2021
```
* [NPU] modifications for model ernie-1.0

* rollback 503003 and change cast to dtype
```