- 22 May 2023 (10 commits)

- Committed by zhupengyang
- Committed by Yuanle Liu
- Committed by JYChen
- Committed by Wilber
- Committed by Yuanle Liu
  [Inference] Add the config.enable_low_precision_io API and remove the reliance on AnalysisConfig::Precision in TRT (#52485)
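
  The commit above adds a low-precision I/O switch to the inference config. Below is a minimal sketch of how it might be called from the Python inference API; it assumes the Python binding uses the same name as the commit title (enable_low_precision_io), and the model paths are placeholders.

  ```python
  import paddle.inference as paddle_infer

  # Placeholder model files; any exported inference model works here.
  config = paddle_infer.Config("model.pdmodel", "model.pdiparams")
  config.enable_use_gpu(1000, 0)   # 1000 MB initial memory pool on GPU 0

  # Run the TensorRT subgraph engine in FP16.
  config.enable_tensorrt_engine(
      workspace_size=1 << 30,
      precision_mode=paddle_infer.PrecisionType.Half,
  )

  # Assumption: per the commit title, this keeps feed/fetch tensors in low
  # precision instead of forcing them back to FP32 at the model boundary.
  config.enable_low_precision_io(True)

  predictor = paddle_infer.create_predictor(config)
  ```
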
- Committed by zhoutianzi666
  * Fix transfer_layout when the input size is too big
  * Do not add TransferLayoutKernelGPU
  * Add int64 support and add a check
- Committed by zhangyikun02
- Committed by Tian Zheng
  * Add a GPU kernel for the multiclass_nms3 op
  * Make the multiclass_nms3 GPU kernel output consistent with the CPU kernel
  * Fix API incompatibility
  * Fix unit tests on builds without CUDA
  * Fix the ROCm build
  * Remove fluid headers; use the default atol for the unit test
  * Change function and variable naming
  * Add comments; reduce redundant code
  * Use the paddle test framework
- Committed by niuliling123
  Print a Python traceback when the debug mode is CHECK_NAN_INF_AND_ABORT and the backward pass has NaN/Inf (#52808)
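
  The NaN/Inf check that this commit extends is selected by the debug mode named in its title. A minimal sketch of turning it on from Python follows; it assumes the TensorCheckerConfig / DebugMode interface in paddle.amp.debugging exposes the CHECK_NAN_INF_AND_ABORT mode referenced above, and the tiny forward/backward pass exists only to produce an Inf in a gradient.

  ```python
  import paddle
  from paddle.amp.debugging import (
      DebugMode,
      TensorCheckerConfig,
      enable_tensor_checker,
  )

  # Assumption: this mode aborts (and, per the commit above, now prints the
  # Python traceback) as soon as a checked tensor contains NaN/Inf.
  config = TensorCheckerConfig(
      enable=True, debug_mode=DebugMode.CHECK_NAN_INF_AND_ABORT
  )
  enable_tensor_checker(config)

  x = paddle.to_tensor([4.0, 0.0, 1.0], stop_gradient=False)
  y = paddle.sqrt(x)      # forward values are finite
  y.sum().backward()      # d/dx sqrt(x) = 1/(2*sqrt(x)) -> Inf at x = 0
  ```
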
- Committed by wangshengxiang
  * Bind the XPU op: 3D grid sample
  * Fix edge cases in the XPU ops reshape & slice

- 20 May 2023 (3 commits)

- Committed by ShenLiang
- Committed by zhangbo9674
- Committed by zhangbo9674
  * Add types and attributes
  * Remove some const_cast
  * Refine code

- 19 May 2023 (25 commits)

- Committed by shentanyue
- Committed by Frank Lin
  * Improve readability and overall clarity of logging
  * Add the set_input_type API for specifying input data types
  * Specify input data types
- Committed by wz1qqx
- Committed by warrentdrew
  * Add minimum grad composite rules
  * Add a public Python API
  * Fix format
  * Fix format
  * Update the test case
  * Fix the test case
  * Fix format
  * Fix cmakelist.txt
  * Fix format
  * Fix a param problem
  * Fix the op and composite rule
  * Fix a bf16 CPU support problem
  * Fix a bf16 CPU issue
  * Fix the axis error log
  * Add axis for maximum
  * Revert a commit
  * Remove .orig
  * Fix a generic problem
  * Revert the max op
  * Fix an axis error
  * Fix the maximum axis
  * Fix test_check_output
  * Fix CINN
  * Fix the minimum/maximum axis check
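
  The first two bullets of the commit above add composite (primitive-op) gradient rules and a public Python API for minimum/maximum. The sketch below only exercises the op those rules target, paddle.minimum with broadcasting and a backward pass; it does not show the rule mechanism itself.

  ```python
  import paddle

  # The gradient of minimum routes the upstream grad to whichever input is
  # smaller at each position (ties conventionally go to x).
  x = paddle.to_tensor([[1.0, 5.0], [3.0, 2.0]], stop_gradient=False)
  y = paddle.to_tensor([2.0, 4.0], stop_gradient=False)  # broadcast over rows

  out = paddle.minimum(x, y)
  out.sum().backward()

  print(out)      # [[1., 4.], [2., 2.]]
  print(x.grad)   # 1 where x is the smaller operand, else 0
  print(y.grad)   # per-column grads summed over the broadcast (row) axis
  ```
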
- Committed by 王明冬
- Committed by limingshu
  * Reorganize the forward code of flash-attention
  * Fix the forward pass
  * Remove some unused code
  * Simplify the code and fix the backward pass
  * Change all LOG(INFO) to VLOG and fix the backward pass
  * Add scale for the AF2 flash_attn; many thanks to xreki and shaojie for debugging this code
  * Decrease the effect of debug printing on performance
  * Unify the initialization of flashattn arguments
  * Rewrite the reshape of temp_mask and temp_bias
  * API support for use_flash_attn
  * Fix a compiling error on CI
  * Try to crop the flash-attention lib
  * Correct the condition for whether flash-attn can be used
  * Remove the softmax_out argument
  * Remove is_causal
  * Polish the code
  * Fix qkv_transpose_out's shape and the scaling of Q * K
  * Update the commit of flash-attention
  Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
- Committed by limingshu
- Committed by RedContritio
- Committed by gouzil
  * [tools] Add a PADDLE_API check to file diff approvals
  * [tools] Fix determine
  * [tools] Fix determine
  * [tools] Change to full character matching
    Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
  * [tools] Update echo_line
    Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
  * [tools] Update check_approval
    Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
  Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
- Committed by GGBond8488
  * Remove user-defined grad
  * Fix errors
  * Remove unused self.x_grad, self.out_grad
- Committed by Zhang Zheng
  * Add a large-dim test of log_softmax
  * Fix
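
  The log_softmax commit above only adds a test for very large reduction dimensions. A small sketch of the API it covers follows, with an illustrative (not the test's actual) dimension size.

  ```python
  import paddle
  import paddle.nn.functional as F

  # A wide last dimension exercises the large-dim path the new test targets.
  x = paddle.randn([4, 100000], dtype="float32")
  out = F.log_softmax(x, axis=-1)

  # Exponentiating log-probabilities should sum to 1 along the softmax axis.
  print(paddle.exp(out).sum(axis=-1))   # ~[1., 1., 1., 1.]
  ```
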
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by xiaoguoguo626807
  * Review
  * Fix an opcompat bug
  * Modify pybind
- Committed by Charles-hit
- Committed by Danyang Zhang
  * Delete bf16 of cross entropy
  * Delete bf16 of cross entropy
- Committed by zhoutianzi666
  * decrease_peak_memory
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by Galaxy1458
- Committed by ronnywang
- Committed by zhangyuqin1998

- 18 May 2023 (2 commits)

- Committed by houj04
- Committed by Galaxy1458