Commits · 53f5edbd1107b22a51e7f025a47ecfec7c438b18 · PaddlePaddle / Paddle

Mar 31, 2023 (5 commits)
- [XPU] register bmm fp16 (#52354) · 53f5edbd
  Committed by houj04 on Mar 31, 2023
- [CodeStyle][UP030][UP031][UP032] using f-string (#52062) · 40e4f5a5
  Committed by 张春乔 on Mar 31, 2023
```
* autofix
Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com>

* revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py

* empty commit, trigger ci

* fix test_slice

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
```
- fix xpu fp16 lod_reset (#52346) · b4137338
  Committed by zhupengyang on Mar 31, 2023
- fix copyright date in scope_guard.h, test=document_fix (#52026) · 50c949f0
  Committed by sneaxiy on Mar 31, 2023
- use int64 for c split (#52279) (#52340) · 9fd4fd5f
  Committed by Yuang Liu on Mar 31, 2023
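
For readers unfamiliar with the rule codes in the f-string commit (#52062) above: assuming UP030, UP031 and UP032 refer to the pyupgrade-derived lint rules (explicit positional indices in `format` literals, printf-style `%` formatting, and `str.format` calls that can become f-strings), the rewrite they suggest looks roughly like the sketch below. The variable names and the message text are illustrative only and are not taken from the Paddle code base.

```python
# Hypothetical values, used only to illustrate the three formatting styles.
name = "bmm"
count = 3

# Style targeted by UP030 (assumed): explicit positional indices.
indexed = "{0} has {1} kernels".format(name, count)

# Style targeted by UP031 (assumed): printf-style % formatting.
percent = "%s has %d kernels" % (name, count)

# Style targeted by UP032 (assumed): str.format with implicit positions.
formatted = "{} has {} kernels".format(name, count)

# The f-string form that all three can be rewritten to.
fstring = f"{name} has {count} kernels"

# All four spellings build the same string.
print(indexed == percent == formatted == fstring)
```

The rewrite is purely syntactic, which fits the "autofix" note in the commit body.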
Mar 30, 2023 (35 commits)

support prim & cinn test for layer_norm (#51272) · 84504f35

Committed by Weilong Wu on Mar 30, 2023

* support layer_norm prim and cinn test

* enable cinn test

* fix merge conflict

* polish input for check_output_with_place

* fix merge conflict

* add more test case

* fix merge conflict

* polish test case

* polish op_test

* change ln_g rules

* modify scale is none case

* modify scale is none case

* add public_python_api for check prim

* modify setoutputgrad and fp64bug

* add todo & delete log

* recover

* fix some errors

* recover

* recover

* recover

* recover

* fix merge conflicts

---------
Co-authored-by: wangruting <wangruting@baidu.com>

[Prim] fix loss of composite rule (#52120) · a4e0f666

Committed by cyber-pioneer on Mar 30, 2023

* fix_prim

* fix bug

* add note

* fix logic

* fix

* add note

* fix check

* fix bug

* fix bug

* fix bug

* add debug

* fix check

* fix bug

* sync print log

* fix test case

* change default

* change test case time

move elementwise_raw_kernel to new dir (#51965) · 49461a02
Committed by zhangyuqin1998 on Mar 30, 2023
```
* move elementwise raw

* fix

* fix
```
[Zero-Dim] Support broadcast_tensors input 0D and distribution API output 0D (#51721) · 2bd0a946
Committed by zhouweiwei2014 on Mar 30, 2023

[Bug-fix] fix bug of Tensor.item() when CUDAPinnedPlace (#52322) · 0f9ec013
Committed by zhouweiwei2014 on Mar 30, 2023

[AMP OP&Test] Transpose OP fp16 unitest (#52315) · f1cdd654
Committed by Wang Xinyu on Mar 30, 2023
```
* transpose fp16 test

* transpose auto tune fp16 test
```
[Sparse]Fix the bug of elementwise_grad (#52102) · aeb8c2e2
Committed by zhangkaihuo on Mar 30, 2023

[XPU] add delete_cast_op_pass (#52305) · 8b622d58
Committed by zhupengyang on Mar 30, 2023

[Move Test] Move prim (#52167) · 3e2d0195
Committed by Zheng-Bicheng on Mar 30, 2023
```
* update

* update
```
mv paddle/fluid/platform/device/xpu/tests 2 test/xpu/cpp (#52243) · bc5bae16
Committed by Kim on Mar 30, 2023
```
* mv paddle/fluid/platform/device/xpu/tests 2 test/xpu/cpp

* add missing cmake
```
[Move Test] Move prim cpp (#52173) · a445466f
Committed by Zheng-Bicheng on Mar 30, 2023
```
* update

* update

* update
```
support complex data types for libpaddle.Tensor's element get and set (#52324) · 13b12457

Committed by Feiyu Chan on Mar 30, 2023

1. add type caster for paddle's complex type, to allow pybind to automatically cast it with python's complex type;
2. add complex64 and complex128 data type for `libpaddle.Tensor`'s element get and set(which is required to perturb an element to get the numerical derivative)
3. add support for cuda pinned place in `libpaddle.Tensor` element get and set

---
4. fix a bug in op code generation.(Creation of output folder in concurrent with parsing op yamls.)
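
Item 2 above mentions perturbing a single element to obtain a numerical derivative. As a rough, framework-independent sketch of that idea (plain Python lists stand in for a tensor, and `numeric_grad` and `squared_norm` are hypothetical names, not Paddle APIs), a central-difference gradient over complex inputs could look like this:

```python
def numeric_grad(f, values, delta=1e-6):
    """Central-difference gradient of a real-valued f over complex inputs.

    For each element z, the returned complex number holds df/d(Re z) in its
    real part and df/d(Im z) in its imaginary part.
    """
    grads = []
    for i in range(len(values)):
        original = values[i]          # element "get"
        parts = []
        # Perturb along the real axis first, then along the imaginary axis.
        for step in (delta, delta * 1j):
            values[i] = original + step   # element "set"
            f_plus = f(values)
            values[i] = original - step
            f_minus = f(values)
            values[i] = original          # restore before the next step
            parts.append((f_plus - f_minus) / (2 * delta))
        grads.append(complex(parts[0], parts[1]))
    return grads


def squared_norm(values):
    # Real-valued test function: sum of |z|^2 over all elements.
    return sum(abs(v) ** 2 for v in values)


z = [1 + 2j, -0.5 + 0.25j]
grad = numeric_grad(squared_norm, z)
print(grad)
```

For `squared_norm` the analytic result is `2 * Re(z)` and `2 * Im(z)` per element, so the printed values should be close to `2+4j` and `-1+0.5j`.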

[AMP OP&Test] add fp16 test for linspace (#52161) · 40b30f50
Committed by Roc on Mar 30, 2023

[AMP] Add python API for collecting operator stats. (#52215) · 73544322

Committed by Yiqun Liu on Mar 30, 2023

* [AMP] Add python API for collecting operator stats.

* Fix import and polish codes.

* Add more unittest.

* Add doc for the new APIs.

add autogen code support for spectral_norm (#52145) · 28927209
Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for spectral_norm

* bug fixed

* fix PR-CI-Static-Check fail
```
Speedup worker (#51760) · 8ca86d72

Committed by pangengzheng on Mar 30, 2023

* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor intialization

* async infer shape and reuse memory

adjust binding order (#52225) · 16ec22c4
Committed by Yuanle Liu on Mar 30, 2023

[AMP OP&Test]Modify the FP16 and BF16 OpTest of Add_N (#52311) · e3217e3e

Committed by Vvsmile on Mar 30, 2023

* adjust defalut tolerance of output and grad

* fix a bug in the grad of OpTest

* fix the type of setting defalut value in optest, both forward and
backward

* add defalut

* fix test_sum_op

* fix test_sum_op test for testing add_n

* modify the add_n op_test

add scatter composite rule. (#52005) · e16eb22c

Committed by zxcd on Mar 30, 2023

* add scatter composite rule.

* add public_python_api

* add python unit16 support.

* fix code style.

* add cinn to makelist

* cinn unsupport uint16, forbidden cinn when dtype==uint16.

add xpu cumprod, group norm grad (#52089) · fb16bdc7
Committed by ykkk2333 on Mar 30, 2023

Committed by huangjiyi on Mar 30, 2023

* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: chenruibiao <chenruibiao@baidu.com>

93d01787

[XPU] add delete_concat_op_pass (#52304) · 70ebef81
Committed by zhupengyang on Mar 30, 2023

Fix bug of c_softmax_with_cross_entropy_op_xpu_op (#52296) · 8ef97088

Committed by Ghost Screaming on Mar 30, 2023

* Support ignore_index for c_softmax_with_cross_entropy_op.

* Polish code. Remove useless comments and add Testcase.

* Polish code for TestCase.

* Polish code.

* Polish code style.

* Polish code.

* Change loss calculation formula and ignore_index dtype.

* Polish TestCase.

* Fix bug of c_softmax_with_cross_entropy_op_xpu_op. Attribute 'ignore_index'
dtype is int64_t.

[AMP&OP_TEST] Fix interp test case (#52282) · dfa893fd
Committed by 傅剑寒 on Mar 30, 2023
```
* delete check_dygraph and use default atol,max_relative_error

* add test case for bicubic_interp
```
[AMP OP&Test] Register FP16 for multinomial. (#52107) · 7788b65e
Committed by yunyaoXYY on Mar 30, 2023
```
* add FP16 for multinomial

* fix input data

* update code

* fix FP16

* fix code
```
rename Scalar related utility functions(use CamelCase) (#52280) · e5a0dc31
Committed by Feiyu Chan on Mar 30, 2023

[Perf] remove sync_calc_stream and sync_comm_stream (#51989) · 0f4229c5
Committed by kangguangli on Mar 30, 2023
```
* remove sync_calc_stream and sync_comm_stream

* fix ci bug

* fix

* fix

* fix
```
support auto generate for prelu (#51913) · d1c7b386

Committed by Ainavo on Mar 30, 2023

* support auto generate for prelu

* add input parameters in op_compat

* del attrs ; add kernel data_type

* add PreluGradInferMeta

[AMP] use promote dtype when amp_level=O2 (#51063) · 6f8ab1fa
Committed by Zhang Ting on Mar 30, 2023

[AMP OP&Test] Strided slice fp16 and bf16 unitest (#52220) · 5cdd9f2c
Committed by Wang Xinyu on Mar 30, 2023
```
* stride slice fp16 and bf16 unitest

* fix code style

* add self.dtype
```
[Test Mv] ipu_test (#52143) · 38a477e2

Committed by gouzil on Mar 30, 2023

* [Test Mv] ipu_test

* [Test Mv] cmake add py_test_modules

* [Move Test] rm py_test_modules

* rm asp

fix gcc12 error (#52318) · 77b7765f
Committed by risemeup1 on Mar 30, 2023

add autogen code support for sigmoid_cross_entropy_with_logits (#52263) · 710c13ed
Committed by gouzil on Mar 30, 2023
```
* add autogen code support for sigmoid_cross_entropy_with_logits

* add inplace
```
add autogen code support for merge_selected_rows (#52274) · 6cd3575c
Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for merge_selected_rows

* bug fixed
```
force sync batch norm grad sequential (#52268) · 336160cf
Committed by wanghuancoder on Mar 30, 2023
```
* force sync batch norm grad sequential
```
PaddlePaddle / Paddle, synced successfully almost 2 years ago