Commits · 13b12457f6e0a677e083f27849c73ff801c151f2 · PaddlePaddle / Paddle

Mar 30, 2023 · 28 commits

support complex data types for libpaddle.Tensor's element get and set (#52324) · 13b12457

Committed by Feiyu Chan on Mar 30, 2023

1. add type caster for paddle's complex type, to allow pybind to automatically cast it with python's complex type;
2. add complex64 and complex128 data types for `libpaddle.Tensor`'s element get and set (which is required to perturb an element to get the numerical derivative)
3. add support for cuda pinned place in `libpaddle.Tensor` element get and set

---
4. fix a bug in op code generation (the output folder was created concurrently with parsing the op YAMLs).
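
The "perturb an element to get the numerical derivative" step in item 2 can be sketched as follows. This is only an illustration under stated assumptions: it uses a NumPy complex array as a stand-in for `libpaddle.Tensor`, and `numerical_grad`, `f`, and `delta` are hypothetical names that do not come from the PR.

```python
import numpy as np


def numerical_grad(f, x, delta=1e-6):
    """Central-difference derivatives of a real-valued f with respect to
    the real and imaginary parts of every element of a complex array."""
    x = np.array(x, dtype=np.complex128)  # work on a private copy
    grad_re = np.zeros(x.shape)
    grad_im = np.zeros(x.shape)
    for idx in np.ndindex(x.shape):
        orig = x[idx]  # element get
        for step, out in ((delta, grad_re), (1j * delta, grad_im)):
            x[idx] = orig + step  # element set: perturb forward
            f_plus = f(x)
            x[idx] = orig - step  # element set: perturb backward
            f_minus = f(x)
            out[idx] = (f_plus - f_minus) / (2 * delta)
        x[idx] = orig  # restore the element before moving on
    return grad_re, grad_im


z = np.array([1 + 2j, -3 + 0.5j])
grad_re, grad_im = numerical_grad(lambda a: float(np.sum(np.abs(a) ** 2)), z)
```

For `f(z) = sum(|z|^2)` the analytic derivatives are `2 * z.real` and `2 * z.imag`, which gives a quick sanity check on the finite-difference result.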

[AMP OP&Test] add fp16 test for linspace (#52161) · 40b30f50
Committed by Roc on Mar 30, 2023

[AMP] Add python API for collecting operator stats. (#52215) · 73544322

Committed by Yiqun Liu on Mar 30, 2023

* [AMP] Add python API for collecting operator stats.

* Fix import and polish codes.

* Add more unittest.

* Add doc for the new APIs.

add autogen code support for spectral_norm (#52145) · 28927209
Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for spectral_norm

* bug fixed

* fix PR-CI-Static-Check fail
```

Speedup worker (#51760) · 8ca86d72

Committed by pangengzheng on Mar 30, 2023

* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor initialization

* async infer shape and reuse memory

adjust binding order (#52225) · 16ec22c4
Committed by Yuanle Liu on Mar 30, 2023

add scatter composite rule. (#52005) · e16eb22c

Committed by zxcd on Mar 30, 2023

* add scatter composite rule.

* add public_python_api

* add python uint16 support.

* fix code style.

* add cinn to makelist

* CINN does not support uint16; disable CINN when dtype==uint16.

add xpu cumprod, group norm grad (#52089) · fb16bdc7
Committed by ykkk2333 on Mar 30, 2023

Committed by huangjiyi on Mar 30, 2023

* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: chenruibiao <chenruibiao@baidu.com>

93d01787

[XPU] add delete_concat_op_pass (#52304) · 70ebef81
Committed by zhupengyang on Mar 30, 2023

Fix bug of c_softmax_with_cross_entropy_op_xpu_op (#52296) · 8ef97088

Committed by Ghost Screaming on Mar 30, 2023

* Support ignore_index for c_softmax_with_cross_entropy_op.

* Polish code. Remove useless comments and add Testcase.

* Polish code for TestCase.

* Polish code.

* Polish code style.

* Polish code.

* Change loss calculation formula and ignore_index dtype.

* Polish TestCase.

* Fix bug of c_softmax_with_cross_entropy_op_xpu_op. Attribute 'ignore_index'
dtype is int64_t.

[AMP OP&Test] Register FP16 for multinomial. (#52107) · 7788b65e
Committed by yunyaoXYY on Mar 30, 2023
```
* add FP16 for multinomial

* fix input data

* update code

* fix FP16

* fix code
```

rename Scalar related utility functions(use CamelCase) (#52280) · e5a0dc31
Committed by Feiyu Chan on Mar 30, 2023

support auto generate for prelu (#51913) · d1c7b386

Committed by Ainavo on Mar 30, 2023

* support auto generate for prelu

* add input parameters in op_compat

* del attrs ; add kernel data_type

* add PreluGradInferMeta

[AMP] use promote dtype when amp_level=O2 (#51063) · 6f8ab1fa
Committed by Zhang Ting on Mar 30, 2023

[AMP OP&Test] Strided slice fp16 and bf16 unitest (#52220) · 5cdd9f2c
Committed by Wang Xinyu on Mar 30, 2023
```
* stride slice fp16 and bf16 unitest

* fix code style

* add self.dtype
```

fix gcc12 error (#52318) · 77b7765f
Committed by risemeup1 on Mar 30, 2023

add autogen code support for sigmoid_cross_entropy_with_logits (#52263) · 710c13ed
Committed by gouzil on Mar 30, 2023
```
* add autogen code support for sigmoid_cross_entropy_with_logits

* add inplace
```
add autogen code support for merge_selected_rows (#52274) · 6cd3575c
Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for merge_selected_rows

* bug fixed
```
force sync batch norm grad sequential (#52268) · 336160cf
Committed by wanghuancoder on Mar 30, 2023
```
* force sync batch norm grad sequential
```

[Test Mv] remove infrt (#52270) · 551ff882
Committed by jjyaoao on Mar 30, 2023

Skip device transfer when arg-defs is set to Allbackend (#52294) · 54497c47
Committed by Ruibiao Chen on Mar 30, 2023

[BugFix]Fix segment fault in order setting (#52293) · d2cdc7e3
Committed by ShenLiang on Mar 29, 2023
```
* fix bug in proto

* add utest
```

fix the compare in PD_MEA_CHECK_OVERFLOW (#52300) · 155018ee
Committed by Danyang Zhang on Mar 30, 2023

Change some op with xpu control (#52067) · 1faa06f0
Committed by lzydev on Mar 30, 2023
```
* change op with xpu

* change range yaml

* fix bug in generate_op.py
```

[CINN] pass global seed to CINN (#52078) · 94aea284

Committed by jiangcheng on Mar 30, 2023

* [CINN] pass global seed to CINN

* fix cu not include cinn/runtime/flags.h bug

* fix DefaultCUDAGenerator should has device id bug

[CodeStyle][C416][C417] rewrite unnecessary comprehension with function call... · 929892c3

Committed by cyberslack_lee on Mar 30, 2023

[CodeStyle][C416][C417] rewrite unnecessary comprehension with function call and use generator instead of map (#52140)

* codestyle c416 c417

* fix error

* fix inc

* unify all C4 rules into one

* fix inc

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
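
The two lint rules named in this commit title come from flake8-comprehensions. The rewrites they ask for look roughly like the sketch below; the variable names and data are illustrative only and are not taken from the Paddle code base.

```python
# C416: a comprehension that only copies its input can be replaced
# by a direct constructor call.
pairs = [("a", 1), ("b", 2)]
before_c416 = [p for p in pairs]
after_c416 = list(pairs)

letters = "hello"
before_c416_set = {ch for ch in letters}
after_c416_set = set(letters)

# C417: map() with a lambda reads more directly as a comprehension
# or a generator expression.
values = [1, 2, 3]
before_c417 = list(map(lambda v: v * 2, values))
after_c417 = [v * 2 for v in values]

total_before = sum(map(lambda v: v * v, values))
total_after = sum(v * v for v in values)
```

In each pair the "before" and "after" forms produce the same value; only the spelling changes.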

Add Gloo SendRecv Function (#52221) · b8850521

Committed by yuehuayingxueluo on Mar 30, 2023

* add gloo  send_recv

* fix code_stype

* fix CI bug

* fix send_recv.cc

* add send_recv without sync_op

* fix send_recv test

* fix gather.cc

Mar 29, 2023 · 12 commits

[AMP OP&Test] pad3d add unittests of fp16 and bf16 (#51015) · f86d0be7

Committed by zengshao0622 on Mar 29, 2023

* pad3d add unittests of fp16 and bf16

* pad3d add unittests of fp16 and bf16

* fix cuda place

* fix random to uniform

* fix class name

* fix fp16 max relative error to 1.5e-3

* add dtype register for onednn

* add pad uint16 check of common.py

* remove check_eager

* test_check_grad --> test_check_grad_normal

Clear the infrt-related code (#52273) · da5a2584
Committed by jjyaoao on Mar 29, 2023
```
* Clear the infrt-related code

* remove tools/infrt
```

Add group_norm composite rule (#51874) · cabf3921

Committed by Yichen Zhang on Mar 29, 2023

* add group_norm composite rule

* add test for scale_grad and bias_grad

* resolve conflicts

* remove amp in composite_rule.py

* add float16 test

* deal with NHWC format

* keep the composite rule in float16 identical as original kernel

* resolve conflicts

Del old dygraph optest8 (#52094) · d612faf5
Committed by wanghuancoder on Mar 29, 2023
```
* delete old dygraph op test
```

Add output defines for graph_sample_neighbors and group_norm (#51503) · 37bd7e78

Committed by hjyp on Mar 29, 2023

* register output type for GraphSampleNeighbors and GroupNorm

* Update return type

* fix return type

* update

* fix detail

Fix the type conflicts against the openblas (#52187) · a5ca2672
Committed by chenxujun on Mar 29, 2023

[CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output (#52214) · fc02b1e6

Committed by HongyuJia on Mar 29, 2023

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete dtype,shape func of multi_inplace op

* [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output

* [CustomOP Inplace] Auto-generate python API for inplace vector<Tensor> output

Fix_Linux_[-Wterminate]warning (#52186) · 225f1af2
Committed by Galaxy1458 on Mar 29, 2023

[CodeStyle][UP034] remove (()) cases (#52060) · c0697296

Committed by 张春乔 on Mar 29, 2023

* add up34

* modify var name in loop

* revert changes in test_slice

* Revert "modify var name in loop"

This reverts commit 6d748e371afb417054ed0c6b36fd11e87959a90d.

* temporarily ignore test_slice.py

* add comment

* empty commit, re-trigger all ci

* fix inc

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

[BugFix] fix compute error in fused_dropout_add (#52261) · 8082ba8a
Committed by ShenLiang on Mar 29, 2023
```
* fix bg

* add utest

* add utest
```
[Zero-Dim] change Tensor.numpy() usage to other equivalent usage, avoid hack (#52197) · 73df2b1e
Committed by zhouweiwei2014 on Mar 29, 2023

tanh_double_grad_rules (#52192) · d966301e

Committed by xiaoguoguo626807 on Mar 29, 2023

* tanh_double_grad_rules

* delete log got api_base

* modify composite yaml

* optimize rules

PaddlePaddle / Paddle · synced successfully about 1 year ago