Commits · 3e65641444bcb9c5cad181f223afad462008867e · BaiXuePrincess / Paddle

02 Feb, 2023 · 6 commits
- Fix div 0 error of case10: paddle.nn.functional.max_pool2d/max_pool3d (#50012) · 1451fa51
  Authored by RedContritio on Feb 02, 2023
```
* add stride check for PoolOutputSize

* add unittest
```
- fix build_ci bug,test=docuement_fix (#46467) · 24e395f6
  Authored by risemeup1 on Feb 02, 2023
- Several ops support zero dim on GPU and CPU (#49959) · 5db88d08
  Authored by Ccc on Feb 02, 2023
```
* paddle.nn.functional.softmax
* paddle.nn.functional.log_softmax
* paddle.nn.functional.gumbel_softmax
* paddle.nn.functional.prelu
```
- [BugFix]Fix bugs when compile with OneDNN (#50096) · 3c557e2f
  Authored by YuanRisheng on Feb 02, 2023
```
* fix bugs

* fix ci bugs
```
- Fix the FP16 precision problem of add_n. (#50129) · 14dd68e1
  Authored by liuruyan on Feb 02, 2023
- jit layer optimzer model param memory usage (#50135) · ec6e0a2c
  Authored by Hui Zhang on Feb 02, 2023
```
* jit layer support multi thread
```
01 Feb, 2023 · 19 commits

Fused attention pass fwd, create the fused_attention op. (#50125) · 2b848aef
Authored by Yuang Liu on Feb 01, 2023

add information of build_size (#49397) · e6d29e00
Authored by risemeup1 on Feb 01, 2023

Fix UFA illegal address access of case3: paddle.crop (#49994) · 34bf3d09
Authored by RedContritio on Feb 01, 2023
```
* add range check for crop_kernel

* remove shape negative check

* add unittest
```
Fix null pointer of case8: paddle.slice (#49979) · 3cf50f91
Authored by RedContritio on Feb 01, 2023
```
* add check for input of slice

* add unittest
```
Fix div 0 error of case11: paddle.nn.functional.max_pool1d/max_pool2d/max_pool3d (#50010) · 3ab6faa8
Authored by RedContritio on Feb 01, 2023
```
* add stride check for MaxPool

* add unittests
```
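The usual pooling output-size formula divides by the stride, which is presumably why a zero stride produced the division-by-zero error named in this entry. The following is a minimal Python sketch of such a stride check, not Paddle's actual C++ code; `pool_output_size` and its parameter names are hypothetical and chosen only for illustration.

```python
def pool_output_size(input_size: int, kernel_size: int, padding: int,
                     stride: int, ceil_mode: bool = False) -> int:
    """Output length of one pooled dimension (hypothetical helper)."""
    # Reject a non-positive stride up front instead of dividing by it.
    if stride <= 0:
        raise ValueError(f"stride must be greater than 0, got {stride}")
    span = input_size + 2 * padding - kernel_size
    if ceil_mode:
        # Ceiling division written with integer arithmetic only.
        return -(-span // stride) + 1
    return span // stride + 1


if __name__ == "__main__":
    print(pool_output_size(32, 2, 0, 2))  # 16
    try:
        pool_output_size(32, 2, 0, 0)
    except ValueError as err:
        print("rejected:", err)
```

Raising early keeps the failure a readable error rather than a crash inside the kernel.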

[Zero-Dim] Fix 0-dim tensor for arg_min_max op. (#49570) · e4e94a88

Authored by Zhong Hui on Feb 01, 2023

* fix 0-d tensor for arg_min_max op.

* fix xpu.

* fix zero dims

* fix

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update test_zero_dim_tensor.py

* Update test_zero_dim_tensor_xpu.py

* Update test_zero_dim_tensor.py

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

run infer ut in A10 (#48535) · 71f247b1

Authored by YUNSHEN XIE on Feb 01, 2023

* run infer ut in A10

* add cuda11.2-cudnn8-trt8.4 image

* add paddle_coverage_new.sh

Preln fix (#49802) · e03718f5

Authored by Wang Bojun on Feb 01, 2023

* preln_residual 2 fused_bias_residual

* skip layernorm fix and ut

* code refine

* code style refine

* fix ut

* fix output

* add trt layer fall back info

* refine op teller and ut

* DropoutMaskOut output fix

jit layer support multi thread and fix predictor clone (#50095) · 9fa2eb38

Authored by Hui Zhang on Feb 01, 2023

* jit layer support multi thread

* fix bug

* clone prediector not do graph optimizer

* format

* fix comment and format

* fix override and fromat

* fix

* fix

support grid_sampler_grad op for XPU (#49857) · 520f48d6
Authored by zhangyikun02 on Feb 01, 2023

[Divide by 0 Error] add lu check (#49974) · f71796b6
Authored by gouzil on Feb 01, 2023
```
* [Divide by 0 Error] add lu check

* [Divide by 0 Error] lu check migrate to c++
```
[Divide by 0 Error] add eig check (#49971) · 226a6567

Authored by gouzil on Feb 01, 2023

* [Divide by 0 Error] add eig check

* [Divide by 0 Error] eig check migrate to c++

* [Divide by 0 Error] Fix class name error

[Divide by 0 Error] add norm check (#49966) · 5dfddaea

Authored by gouzil on Feb 01, 2023

* [Divide by 0 Error] add norm check

* [Divide by 0 Error] fix x AttributeError

* [Divide by 0 Error] norm check migrate to c++

Combination of multiple paddle::memory::allocate operation into one for ops (#49126) · bdae5481

Authored by limingshu on Feb 01, 2023

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lar cuda kernel

* Fix bugs

* fix code according to comments

* fix codes according to  review comments

* adding some function overload

* relocate the power operation.

* add bf16 support for index select relevant ops

* revert bf16 type change.

* add changes for more op

* fix code writting bugs

add dynamic shape support for running paddle-trt in calib_mode (#50033) · af673090
Authored by zhoutianzi666 on Feb 01, 2023

Fix UFA illegal address access of case4: paddle.unbind (#49995) · 9ce8cfcf

Authored by RedContritio on Feb 01, 2023

* add axis check for unbind

* add axis range check for unbind

* update unittest and axis validation for unbind

* add unittest invalid axis for unbind

* restore axis extract for unbind

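The axis checks listed in this entry amount to confirming that the axis lies in the range [-rank, rank) before it is used as an index, which is what prevents an out-of-bounds access. Below is a minimal Python sketch of that idea on nested lists; it is not Paddle's implementation, and `normalize_axis` and `unbind_2d` are hypothetical helpers introduced only for illustration.

```python
def normalize_axis(axis: int, rank: int) -> int:
    """Map a possibly negative axis to [0, rank), rejecting invalid values."""
    if not -rank <= axis < rank:
        raise ValueError(
            f"axis must be in [{-rank}, {rank}), got {axis}"
        )
    return axis + rank if axis < 0 else axis


def unbind_2d(matrix, axis=0):
    """Split a 2-D nested list into a list of 1-D lists along `axis`."""
    axis = normalize_axis(axis, rank=2)
    if axis == 0:
        return [list(row) for row in matrix]
    # axis == 1: transpose so that each column becomes one output slice.
    return [list(col) for col in zip(*matrix)]


if __name__ == "__main__":
    data = [[1, 2], [3, 4]]
    print(unbind_2d(data, axis=-1))  # [[1, 3], [2, 4]]
    try:
        unbind_2d(data, axis=2)
    except ValueError as err:
        print("rejected:", err)
```

Validating once at the entry point means every later index computation can assume a non-negative, in-range axis.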

fix gc and infinite buffer size (#50122) · 3e9d8548
Authored by LiYuRio on Feb 01, 2023

[PrimCinn]Fix some vars are wrongly gc in CINN+InterpreterCore (#50116) · 9f231147
Authored by Aurelius84 on Feb 01, 2023
```
* [PrimCinn]Fix some vars are wrongly gc in CINN+InterpreterCore

* fix baseline unittest config

* fix code style
```
H2D data transfer optimization for split kernel (#49086) · 057ba778

Authored by limingshu on Feb 01, 2023

* profile reduce kernel for fp16 and reduceHigherdim

* use reinterpret_cast

* fix for CI on ROCm

* add Macro for ROCm

* ROCm CI config

* ROCm CI config

* unit test repair

* pull

* add common_funcs.h

* reduceType

* Update reduce_function.h

* not higher

* rename

* implement of matmul using cublasLt instead of cublas

* cublasLt bugfix

* Update matmul_kernel_impl.h

* Update matmul_kernel_impl_via_blasLt.h

* for-loop-algo

* PR comments changes

* add macro

* ci unused variable isCublasLt

* ci unused variable isCublasLt macro

* split matmul to autotune

* rewrite the split kernel with segmented_array

* rewrite the split kernel with segmented_array

* rewrite the split kernel with segmented_array

* add some method for cuda_graph

* fix bugs for rocm

* change for ci-error

* i dont know why ci-model-benchmark gives a shit error, so i recover codes with original one to see if original codes work.

* add some changes for passing mode_benchmark and coverage ci

* fix ci error

* fix ci-rocm error

* add some changes for header

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>

31 Jan, 2023 · 15 commits
- support empty input for unique_consecutive (#49978) · dc1b6511
  Authored by RedContritio on Jan 31, 2023
- gn_silu (#49928) · 111075a3
  Authored by wenbin on Jan 31, 2023
```
* gn_silu

* add ut

* set TIMEOUT

* correct comments

* comments

* disable windows ut

* rename parameter
```
- bind pixel_shuffle & pixel_shuffle_grad op for xpu (#50090) · a5f2e1f7
  Authored by wangshengxiang on Jan 31, 2023
- Unary (#49914) · 0d9185b9
  Authored by wenbin on Jan 31, 2023
```
* disable integer

* disable integer

* add cast layer
```
- [pass] Upgrade Constant Folding Pass (#49908) · c3cd8502
  Authored by Zhang Jun on Jan 31, 2023
- Save nan log to file when output_dir is setted (#49200) · c18fddd3
  Authored by niuliling123 on Jan 31, 2023
- Integrate static code gen info (#49858) · 0e51f398
  Authored by Charles-hit on Jan 31, 2023
```
* polish static grad op maker gen

* fix some bugs

* fix static code gen

* solve conflict

* modify composite grad maker name

* integrate phi and fluid info in static code gen

* rename some composite maker

* modify static code gen format
```
- [inference][trt] add elementwise input data type check (#49675) · 5822e15c
  Authored by Zhang Jun on Jan 31, 2023
- [Numpy] Add FP16 dtype for CastNumpy2Scalar (#50002) · 86a23818
  Authored by PuQing on Jan 31, 2023
```
* add FP16 dtype for CastNumpy2Scalar

* fix throw message

* add test

* fix SyntaxWarning

* test skip for float16

* fix dtype mistakes
```
- Add unified device management api (#48651) · 7aaaa1c6
  Authored by ronnywang on Jan 31, 2023
```
* [CustomDevice] add custom device api

* update

* update

* test=document_fix

* update

* update

* add  examples
```
- Fix null pointer of case15: paddle.broadcast_tensors (#49980) · 78ec942b
  Authored by RedContritio on Jan 31, 2023
```
* fix incorrect output shape of broadcast

* add unittest
```
- fix send start msg (#50085) · 1048b166
  Authored by Roc on Jan 31, 2023
- optimize 2D sync_batch_norm (#49663) · 9a4acfee
  Authored by zhangkaihuo on Jan 31, 2023
- not use shm cache default (#50089) · 118aee6f
  Authored by zhangbo9674 on Jan 31, 2023
- Bump Cutlass version to 2.11.0 (#50073) · c64296bf
  Authored by MarDino on Jan 31, 2023

BaiXuePrincess / Paddle is in sync with the fork source project