Commits · 5ed2320c35b973bf0a52a5bb3971a15030cb51c9 · PaddlePaddle / Paddle

20 Mar, 2023 19 commits
- Add LoongArch support (#51109) · 5ed2320c
  Committed by Zhang Na on Mar 20, 2023
  
  5ed2320c
- Support Linear operation in cuBlaslt and plug into attn_gemm and fusedLinear forward op (#51124) · 2dfc3fa8
  Committed by limingshu on Mar 20, 2023
```
* optimization for fused linear op

* fix code format

* optimization for linear fused forward

* merge with develop

* fix bugs for gemm_ephilog

* package of cublaslt ephilogue type with enmu

* final fix before code reviewing

* fix missed fusedType typo

* fix code according to review suggestions

* fix windows ci error

* change location of MatmulPlanner

* add some changes for compiler error fix

---------
```
  2dfc3fa8
- [CodeStyle][UP004] remove useless object inheritance (#51771) · 9983892e
  Committed by Ainavo on Mar 20, 2023
```
* add_up004_for_ruff

* modify the config file and remove object

* fix md
```
  9983892e
- [Werror] fix Werror-maybe-uninitialized in roi_align_grad_kernel (#51633) · 5786f3e4
  Committed by iSerendipity on Mar 20, 2023
```
* fix Werror in roi_align_grad_kernel

* adopt a better way
```
  5786f3e4
- [Zero-Dim] fix Tensor.numpy, cntrol whether to hack process to 1D (#51757) · d7035454
  Committed by zhouweiwei2014 on Mar 20, 2023
  
  d7035454
- Register custom kernel for some all_bakcend kernel (#51639) · e8530a35
  Committed by zyfncg on Mar 20, 2023
```
* register some custom kernel

* fix bug
```
  e8530a35
- [CustomOP unittest] Add customOP multiple inplace unittest (#51758) · 5b6d2f85
  Committed by HongyuJia on Mar 20, 2023
  
  5b6d2f85
- add sigmoid custom grad for prim (#51768) · ac47d003
  Committed by Weilong Wu on Mar 20, 2023
  
  ac47d003
- [xpu] fused_multi_transformer_xpu pass&kernel support (#51571) · 52e1742f
  Committed by mayang002 on Mar 20, 2023
  
  52e1742f
- [Hackathon NO.71] Add pad3d operator to Paddle-TRT (#50986) · c36e3fd2
  Committed by Sonder on Mar 20, 2023
```
* update codes about pad3d

* add codes about Tensor type Padding

* update

* update unit test file

* format code style

* update and to &&'

* rewrite codes about pad3d

* add codes about converting paddle pad format to tensorrt pad format

* fix some errors

* specify trt version range

* fix dims initialization

* fix code style

* update test pad values

* specify pad3d trt version

* update unit test file scope

* update unit test file

* update pad3d paddings convert codes

* update pad3d

* add static mode support

* update test file

* fix bugs about dynamic mode test codes

* fix bug and add limite in op_teller

* use a new padding convert method[ITensor* padding with using Slice to split the pre_pad and the  post pad]

* fix PADDLE_THROW grammaly error

* update test codes

* add size check for Tensor padding
```
  c36e3fd2
- Mv phi and fluid/test To test dir (#50640) · e808fa30
  Committed by tianshuo78520a on Mar 20, 2023
  
  e808fa30
- fill_constant_batch_size_like support bf16 (#51396) · 2a0bd17c
  Committed by FormlessUnit on Mar 20, 2023
```
shape support bf16
```
  2a0bd17c
- Fix unsqueeze with empty axis bug (#51828) · 7a79fd88
  Committed by zhouweiwei2014 on Mar 20, 2023
  
  7a79fd88
- [XPU] add pool3dgrad special dim support (#51727) · 4851c642
  Committed by ykkk2333 on Mar 20, 2023
```
* add xpu tile and concat kernel int64, test=kunlun

* fix previous xpu dataoader bug, and add maxpool3dgrad special dim support, test=kunlun
```
  4851c642
- [Dy2static] Auto Remove Step Scope while GradRunProgramNode GCed. (#51411) · 4f32aae5
  Committed by xiongkun on Mar 20, 2023
```
* merge

* fix bugs while backward multi-times.

* code format by ci
```
  4f32aae5
- update (#51773) · 7f506669
  Committed by Huang Jiyi on Mar 20, 2023
  
  7f506669
- support relue custom vjp (#51742) · 604b7a53
  Committed by Jiabin Yang on Mar 20, 2023
  
  604b7a53
- [Tensor Operants & Prim-Relevant] Tensor supports compare operants (#51713) · 6bfb8152
  Committed by HongyuJia on Mar 20, 2023
```
* [Tensor Operants & Prim-Relevant] Tensor supports compare operants

* fix dependence of test_comp_static

* fix unit test
```
  6bfb8152
- refine eager code gen (#51746) · af95a8b4
  Committed by wanghuancoder on Mar 20, 2023
  
  af95a8b4
19 Mar, 2023 3 commits
- modify tests name (#51739) · 41607667
  Committed by Charles-hit on Mar 19, 2023
  
  41607667
- resgister for ftt_r2c, ftt_c2_r (#51563) · d431b7c7
  Committed by Difer on Mar 19, 2023
```
* resgister for ftt_r2c, ftt_c2_r

* fix clang-format
```
  d431b7c7
- [phi] Add output defs for argsort kernel (#51407) · 545e20f8
  Committed by Sanbu on Mar 19, 2023
```
* Add output defs for argsort kernel

* Update argsort_kernel.cc

* Update argsort_kernel.cu

* Update argsort_kernel.cc
```
  545e20f8
18 Mar, 2023 1 commit
- fix cinn_instruction_run inplace var not found problem (#51769) · f5811a60
  Committed by Leo Chen on Mar 18, 2023
  
  f5811a60
17 Mar, 2023 9 commits
- 【Hackathon No.46】Implement float16 data type support for the Paddle gumbel_softmax operator (#50923) · e0007f31
  Committed by denglianbin on Mar 17, 2023
```
* finish task

* fix some question.

* fix error

* change unittest:zeroDim.
```
  e0007f31
- 【Hackathon No58】fix atan2 (#51185) · b94fe95a
  Committed by Infinity_lee on Mar 17, 2023
  
  b94fe95a
- [PHI] Add multinomial output defs (#51357) · b647c2f0
  Committed by PuQing on Mar 17, 2023
```
* add multinomial output defs

* fix register on gpu
```
  b647c2f0
- [AMP OP&Test] Support float & bfloat16 when using thrust (#51627) · 3b2cd23a
  Committed by Zhang Zheng on Mar 17, 2023
```
* [AMP OP&Test] Support float & bfloat16 when using cub

* fix compile error

* fix

* fix rocm compile error
```
  3b2cd23a
- Fix paddle.incubate.graph_reindex divide by 0 error (#51714) · ef599afe
  Committed by chenxujun on Mar 17, 2023
  
  ef599afe
- [phi][jit] clean paddle/phi/kernels/jit Unused methods (#51446) · 6aa3670f
  Committed by gouzil on Mar 17, 2023
```
* [phi][jit] rm Softmax StrideScal

* [phi][jit] rm kStrideScal

* [phi][jit] fix Softmax clean omission

* [phi][jit] fix Softmax clean omission

* [phi][jit] fix StrideScal clean omission

* [phi][jit] fix mkl SoftmaxKernel clean omission

* [phi][jit] fix test error

* [phi][jit] fix test error

* [phi][jit] rm NCHW16CMulNC

* [phi][jit] fix test error

* [phi][jit] rm HSum HMax

* [phi][jit] fix test error

* [phi][jit] rm StrideASum

* add AUTHORS.md

* [phi][jit] fix test error
```
  6aa3670f
- support fetch empty tensor on CPUPlace (#51735) · ea22fdb0
  Committed by Leo Chen on Mar 17, 2023
```
* support fetch empty tensor on CPUPlace

* fix the shape in unittest of empty output
```
  ea22fdb0
- [Polish utils/pybind.h] Delete p_tensor_type in pybind.h (#51715) · a698eb7d
  Committed by HongyuJia on Mar 17, 2023
  
  a698eb7d
- [Prim] support batch_norm vjp (#51283) · ff40a7e5
  Committed by cyber-pioneer on Mar 17, 2023
```
* add bn vjp

* fix example

* fix code

* fix code

* fix cinn case

* fix code

* fix example

* fix code

* fix example

* fix example
```
  ff40a7e5
16 Mar, 2023 8 commits

- [Custom Operator] Custom op support inplace mechanism (#51620) · f824bc0d
  Committed by HongyuJia on Mar 16, 2023

* init unit test commit, contains register thinking

* support inplace

* get inplaced x.grad

* Try support inplace and hook at the same time

* Support inplace, need debug

* Support inplace successfully

* Inplace use Tensor&, consistent with Tensor*

* fix MapPlainOutputs bug

* fix double grad inplace error

f824bc0d

- rename flash_attn_raw to flash_attn_unpadded (#51704) · 0b778bdc
  Committed by Chitsing KUI on Mar 16, 2023
```
* rename flash_attn_raw to flash_attn_unpadded

* fix static api

* fix static return
```
0b778bdc
- Add Deformable Conv Dynamic Shape Support (#50698) · 86bf8274
  Committed by xjmxyt on Mar 16, 2023
```
* add dynamic support

* add more test

* fix bug

* change test

* change test
```
86bf8274

- add fp32 grad plus fp16 param in adamw (#51141) · 290aa368
  Committed by shaojie_wang on Mar 16, 2023

* add fp32 grad plus fp16 param in adamw

* add python UT

* fix test case

* in test_adamw_op py file, force the moment2 value LE 0

* add a compare option

* remove bf16 fused adam kernel case

290aa368

- Update from_blob API (#51646) · c07c7712
  Committed by Huang Jiyi on Mar 16, 2023

* remove contexts in tensor_utils

* update from_blob

* update from_blob

* update from_blob

* fix bug

* fix bug

c07c7712

- [Auto Parallel Performance] Support BF16 Training (#51285) · 9ded5707
  Committed by JZ-LIANG on Mar 16, 2023

* update env setting

* update pass logic

* dist op support bf16

* backward cast update

* update setting

* update backward

* revert amp pass

* update fp16 backward logic

* register c_embedding bf16

* revert engine

* add unitest

* add unitest

* update unitest

* update cmake

* update math

* update math.py

* update unitest

* update unitest

* revise unitest

* revise unitest

* update unitest

* update unitest

* update unitest

9ded5707

- [PHI] Add rnn and searchsorted output defs (#51360) · 3094d475
  Committed by PuQing on Mar 16, 2023
```
* add rnn and searchsorted output defs

* add gpu kernel
```
3094d475
- [phi decoupling] remove fluid gpu_info usage in phi (#51699) · 907433a7
  Committed by Huang Jiyi on Mar 16, 2023
```
* remove fluid thread_data_registry

* update

* fix bug
```
907433a7

PaddlePaddle / Paddle synced successfully about 1 year ago