Commits · 2bfe8b2c8584ab116c8faa5f7c6c1a09f5d024d0 · BaiXuePrincess / Paddle

02 Jun, 2022 3 commits
- fix the bug of margin cross entropy loss for eager mode (#43161) · 67163fb4
  Authored by Guoxia Wang on Jun 02, 2022
  
  67163fb4
- Extend forward fast layer_norm kernel to support more dimensions. (#43118) · 85baa3c0
  Authored by Li Min on Jun 02, 2022
```
* extend forward fast_ln_kernel to support more column values.
```
  85baa3c0
- Support CUDA Graph for partial graph in dygraph mode (#42786) · d05b940a
  Authored by sneaxiy on Jun 02, 2022
```
* support CUDAGraph for partial graph

* add ut

* fix ci

* fix ut again because of eager mode

* fix kunlun ci

* fix win ci
```
  d05b940a
01 Jun, 2022 11 commits
- Add yaml and unittest for instance_norm op (#43060) · 56ae33b6
  Authored by YuanRisheng on Jun 01, 2022
```
* add yaml

* fix infrt compile bugs
```
  56ae33b6
- [fix] split nanmedian fluid deps (#43135) · b23914c2
  Authored by Aganlengzi on Jun 01, 2022
  
  b23914c2
- [NPU] fix npu runtime error of HCCLParallelContext, test=develop (#43116) · 2dac35f3
  Authored by Qi Li on Jun 01, 2022
  
  2dac35f3
- support nccl api for bfloat16, required >= cudnn 10.1, nccl >= 2.10.3 (#43147) · 67b9b51b
  Authored by Guoxia Wang on Jun 01, 2022
  
  67b9b51b
- Make fuse_gemm_epilogue support transpose_x and transpose_y (#40558) · 048b0013
  Authored by sneaxiy on Jun 01, 2022
```
* support weight transpose

* add ut

* add template

* fix transpose error

* fix transpose_comment

* add api tests

* add skipif

* add doc
```
  048b0013
- remove skip ci directly when the pr is approved (#43130) · 07993044
  Authored by YUNSHEN XIE on Jun 01, 2022
  
  07993044
- code format check upgrade step1: pre-commit, remove-ctrlf, pylint (#43103) · 664758fa
  Authored by Sing_chan on Jun 01, 2022
  
  664758fa
- Unittest parallel (#43042) · dc26d07b
  Authored by zhangchunle on Jun 01, 2022
```
unittest parallel
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
```
  dc26d07b
- Add pinned memory to host memory stats (#43096) · c4b7c485
  Authored by Ruibiao Chen on Jun 01, 2022
```
* Add pinned memory to HostMemoryStats

* Add macro for WrapStatAllocator

* Fix CI errors
```
  c4b7c485
- [revert] revert inference accelarate #43125 · 81622708
  Authored by huzhiqiang on Jun 01, 2022
  
  81622708
- [Yaml]add conv3d, depthwise_conv2d yaml (#42807) · 5f2c251c
  Authored by chentianyu03 on Jun 01, 2022
```
* add conv3d yaml

* add conv3d_grad, conv3d_double_grad

* add final_state_conv3d test case

* add conv3d double test case

* add depthwise_conv2d grad yaml

* add depthwise_conv2d double grad test case

* modify the order of args

* add depthwise_conv2d_grad_grad config
```
  5f2c251c
31 May, 2022 15 commits

Remove mkldnn attributes from base ops (#42852) · 4b89120b

Authored by Sławomir Siwek on May 31, 2022

* remove attrs from base op

* fix typos

* remove brelu

* undo removing code related to matmul

* remove whitespaces

* undo changes in matmul

* remove empty line

4b89120b

[Eager] Fix Full Zero (#43048) · 462ae005

Authored by wanghuancoder on May 31, 2022

* fix full zero

* fix full zero

* fix full zero

* fix full zero

* refine

* refine

* refine

462ae005

put set error_code infront to avoid being skipped (#43014) · d70e45bc
Authored by Sing_chan on May 31, 2022

d70e45bc
[Phi] Polish assign kernel copy impl (#43061) · c9e7c407
Authored by Chen Weihang on May 31, 2022
```
* fix assign kernel copy impl

* fix test failed
```
c9e7c407
[MLU] add mlu kernel for abs op (#43099) · cb195fa0
Authored by cambriconhsq on May 31, 2022

cb195fa0
[Eager] Polish append op using for model perf (#43102) · e9589e35
Authored by Chen Weihang on May 31, 2022
```
* polish append op using

* fix var error

* fix group norm impl
```
e9589e35
[NPU] fix arg_max and reduce_max (#42887) · f9e55dee
Authored by Aganlengzi on May 31, 2022
```
* fix arg_max and reduce_max

* add arg_max ut
```
f9e55dee

[PaddlePaddle Hackathon 2] 16 Add API RRelu (#41823) · 21e1d10f

Authored by thunder95 on May 31, 2022

* rrelu logic

* unregistered op kernel (unresolved)

* commit before merge

* enrich test cases

* fix rrelu-sig bug

* fix tests in cpu environment

* fix typos

* fix code format

* try to fix test case timeout

* optimize test cases

* remove seed, optimize random function

* update en doc for rrelu

* fix rrelu en docs, test=document_fix

* add paper link for en docs, test=document_fix

* udpate en doc

* add r,test=document_fix

21e1d10f

[EinsumOp] Make EinsumOp support bfloat16. (#43085) · a4bb38cb

Authored by xiongkun on May 31, 2022

* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0

* make EInsumOP support bf16

* add unittest for BF16

* add condition for test_BF16

* fix bugs

* fix

a4bb38cb

Fix the underflow of fp16 fake quantize operators (#43088) · 0ae8a2d6
Authored by Leo Chen on May 31, 2022
```
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
```
0ae8a2d6

Support backward prune for eager intermidiate (#43111) · 4700a08e

Authored by Jiabin Yang on May 31, 2022

* support is empty

* fix error

* fix code error

* change to fake empty

* using fake empty first

* using fake empty first

* Support backward prune in fluid

4700a08e

Rename dropout is test (#43098) · 67497119
Authored by Li Min on May 31, 2022
```
* replace dropout_is_test with is_test.
* improve atol on a100.
```
67497119

add embedding yaml (#43029) · 2785f876

Authored by zyfncg on May 31, 2022

* add embedding yaml

* fix infermeta bug

* fix bug of selected_rows infer_meta

* fix selected_rows

* add unittest

2785f876

fix slice plugin (#43110) · b779d2b8
Authored by Wilber on May 31, 2022

b779d2b8

OneDNN md-in-tensor refactoring part 5: Memory descriptor enabled for... · 12d8a567

Authored by jakpiase on May 30, 2022

OneDNN md-in-tensor refactoring part 5: Memory descriptor enabled for elementwises, reductions and expand_v2 ops (#43036)

* enabled md in elementwises, reductions and expand_v2

* CI fix for invalid numpy copy

* fixed formatting

* CI rerun

* changes after review

12d8a567

30 May, 2022 11 commits
- [mlu] add one_hot_v2 mlu kernel (#43025) · 13a21cf7
  Authored by Chenxiao Niu on May 30, 2022
  
  13a21cf7
- Add fused_bias_dropout_residual_ln op and layer. (#43062) · dceccd9d
  Authored by Li Min on May 30, 2022
```
* add fused_bias_dropout_residual_ln op and layer.
```
  dceccd9d
- fix scale_matmul fuse pass (#43089) · e1e0deed
  Authored by heliqi on May 30, 2022
  
  e1e0deed
- [TensorRT] Fix delete fill_constant pass (#43053) · 1448520d
  Authored by shentanyue on May 30, 2022
```
* update lite compile cmake

* Update delete_fill_constant_op_pass.cc

* Update analysis_config.cc
```
  1448520d
- support backward inplace in eager fluid dygraph mode (#43054) · ed2886de
  Authored by pangyoki on May 30, 2022
```
* support backward inplace in eager fluid mode

* fix

* fix

* optimize format

* little change
```
  ed2886de
- Implement fused_gate_attention operator for AlphaFold. (#42018) · fdcdbec5
  Authored by crystal on May 30, 2022
  
  fdcdbec5
- [PaddlePaddle Hackathon 2] 15 Add API Nanmedian (#42385) · f87fa3c0
  Authored by thunder95 on May 30, 2022
```
* nanmedian op

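* fix bug in cuda kernel

* fix count_if incompatibility on other hardware platforms

* fix incompatibility on some cpu hardware

* fix incompatibility on some cpu hardware

* fix isnan check

* handle older numpy versions that do not support all-nan input

* handle older numpy versions that do not support all-nan input

* fix code example

* fix api comment error

* revise backward propagation logic and c++ handling logic

* address review suggestions

* typo pre_dim

* update en docs, test=document_fix

* remove numpy in en doc, test=document_fix

* add r,test=document_fix

* add api to all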

* follow advice from chenwhql
```
  f87fa3c0
- [Framework]accelerate inference period (#42400) · 5df92262
  Authored by huzhiqiang on May 30, 2022
  
  5df92262
- [MLU]add mlu kernel for log_softmax op (#43040) · 586f9429
  Authored by cambriconhsq on May 30, 2022
  
  586f9429
- Optimize memcpy operation in Eigh (#42853) · 806073d6
  Authored by limingshu on May 30, 2022
```
* 1st commit

* fix usless change in header transpose_kernel_h file

* add sync
```
  806073d6
- [Dy2St]Fix cond_block_grad error when handle no need grad vras (#43034) · cd3d0911
  Authored by WangZhen on May 30, 2022
```
* Fix cond_block_grad error when handle no need grad vras

* Add comment and UT
```
  cd3d0911

BaiXuePrincess / Paddle is in sync with the upstream project it was forked from