Commits · 8ca86d72cab74934b6209c84a78b3fb6dd920cbd · PaddlePaddle / Paddle

30 Mar, 2023 · 34 commits
- Speedup worker (#51760) · 8ca86d72
  Committed by pangengzheng on Mar 30, 2023
```
* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor intialization

* async infer shape and reuse memory
```
  8ca86d72
- adjust binding order (#52225) · 16ec22c4
  Committed by Yuanle Liu on Mar 30, 2023
  16ec22c4
- [AMP OP&Test]Modify the FP16 and BF16 OpTest of Add_N (#52311) · e3217e3e
  Committed by Vvsmile on Mar 30, 2023
```
* adjust defalut tolerance of output and grad

* fix a bug in the grad of OpTest

* fix the type of setting defalut value in optest, both forward and
backward

* add defalut

* fix test_sum_op

* fix test_sum_op test for testing add_n

* modify the add_n op_test
```
  e3217e3e
- add scatter composite rule. (#52005) · e16eb22c
  Committed by zxcd on Mar 30, 2023
```
* add scatter composite rule.

* add public_python_api

* add python unit16 support.

* fix code style.

* add cinn to makelist

* cinn unsupport uint16, forbidden cinn when dtype==uint16.
```
  e16eb22c
- add xpu cumprod, group norm grad (#52089) · fb16bdc7
  Committed by ykkk2333 on Mar 30, 2023
  fb16bdc7
- register fluid kerenls to phi [part 1] (#52014) · 93d01787
  Committed by huangjiyi on Mar 30, 2023
```
* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: chenruibiao <chenruibiao@baidu.com>
```
  93d01787
- [XPU] add delete_concat_op_pass (#52304) · 70ebef81
  Committed by zhupengyang on Mar 30, 2023
  70ebef81
- Fix bug of c_softmax_with_cross_entropy_op_xpu_op (#52296) · 8ef97088
  Committed by Ghost Screaming on Mar 30, 2023
```
* Support ignore_index for c_softmax_with_cross_entropy_op.

* Polish code. Remove useless comments and add Testcase.

* Polish code for TestCase.

* Polish code.

* Polish code style.

* Polish code.

* Change loss calculation formula and ignore_index dtype.

* Polish TestCase.

* Fix bug of c_softmax_with_cross_entropy_op_xpu_op. Attribute 'ignore_index'
dtype is int64_t.
```
  8ef97088
- [AMP&OP_TEST] Fix interp test case (#52282) · dfa893fd
  Committed by 傅剑寒 on Mar 30, 2023
```
* delete check_dygraph and use default atol,max_relative_error

* add test case for bicubic_interp
```
  dfa893fd
- [AMP OP&Test] Register FP16 for multinomial. (#52107) · 7788b65e
  Committed by yunyaoXYY on Mar 30, 2023
```
* add FP16 for multinomial

* fix input data

* update code

* fix FP16

* fix code
```
  7788b65e
- rename Scalar related utility functions(use CamelCase) (#52280) · e5a0dc31
  Committed by Feiyu Chan on Mar 30, 2023
  e5a0dc31
- [Perf] remove sync_calc_stream and sync_comm_stream (#51989) · 0f4229c5
  Committed by kangguangli on Mar 30, 2023
```
* remove sync_calc_stream and sync_comm_stream

* fix ci bug

* fix

* fix

* fix
```
  0f4229c5
- support auto generate for prelu (#51913) · d1c7b386
  Committed by Ainavo on Mar 30, 2023
```
* support auto generate for prelu

* add input parameters in op_compat

* del attrs ; add kernel data_type

* add PreluGradInferMeta
```
  d1c7b386
- [AMP] use promote dtype when amp_level=O2 (#51063) · 6f8ab1fa
  Committed by Zhang Ting on Mar 30, 2023
  6f8ab1fa
- [AMP OP&Test] Strided slice fp16 and bf16 unitest (#52220) · 5cdd9f2c
  Committed by Wang Xinyu on Mar 30, 2023
```
* stride slice fp16 and bf16 unitest

* fix code style

* add self.dtype
```
  5cdd9f2c
- [Test Mv] ipu_test (#52143) · 38a477e2
  Committed by gouzil on Mar 30, 2023
```
* [Test Mv] ipu_test

* [Test Mv] cmake add py_test_modules

* [Move Test] rm py_test_modules

* rm asp
```
  38a477e2
- fix gcc12 error (#52318) · 77b7765f
  Committed by risemeup1 on Mar 30, 2023
  77b7765f
- add autogen code support for sigmoid_cross_entropy_with_logits (#52263) · 710c13ed
  Committed by gouzil on Mar 30, 2023
```
* add autogen code support for sigmoid_cross_entropy_with_logits

* add inplace
```
  710c13ed
- add autogen code support for merge_selected_rows (#52274) · 6cd3575c
  Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for merge_selected_rows

* bug fixed
```
  6cd3575c
- force sync batch norm grad sequential (#52268) · 336160cf
  Committed by wanghuancoder on Mar 30, 2023
```
* force sync batch norm grad sequential
```
  336160cf
- [Test Mv] remove infrt (#52270) · 551ff882
  Committed by jjyaoao on Mar 30, 2023
  551ff882
- Skip device transfer when arg-defs is set to Allbackend (#52294) · 54497c47
  Committed by Ruibiao Chen on Mar 30, 2023
  54497c47
- [AMP OP&Test] assign op add fp16 、bfp16 test (#52233) · 41f0e3c3
  Committed by zhenhailiu on Mar 30, 2023
```
* add fp16 bfp16 test

* polish

* polish

* polish
```
  41f0e3c3
- [AMP OP&Test] Arg min max bf16 test (#52276) · 3161e6c3
  Committed by zhenhailiu on Mar 30, 2023
```
* polish

* add type check
```
  3161e6c3
- [AMP OP&Test] element_wise_add_fp16_test (#52240) · bed54a70
  Committed by zhenhailiu on Mar 30, 2023
  
  bed54a70
- [BugFix]Fix segment fault in order setting (#52293) · d2cdc7e3
  Committed by ShenLiang on Mar 29, 2023
```
* fix bug in proto

* add utest
```
  d2cdc7e3
- fix the compare in PD_MEA_CHECK_OVERFLOW (#52300) · 155018ee
  Committed by Danyang Zhang on Mar 30, 2023
  155018ee
- Change some op with xpu control (#52067) · 1faa06f0
  Committed by lzydev on Mar 30, 2023
```
* change op with xpu

* change range yaml

* fix bug in generate_op.py
```
  1faa06f0
- support python object input data broadcast for model parallel (#51765) · 8baf33a4
  Committed by Guoxia Wang on Mar 30, 2023
```
* support python object input data broadcast for model parallel

* add unittest

* fix

* fix concat 0D tensor

* fix codestyle
```
  8baf33a4
- [CINN] pass global seed to CINN (#52078) · 94aea284
  Committed by jiangcheng on Mar 30, 2023
```
* [CINN] pass global seed to CINN

* fix cu not include cinn/runtime/flags.h bug

* fix DefaultCUDAGenerator should has device id bug
```
  94aea284
- [CodeStyle][C416][C417] rewrite unnecessary comprehension with function call... · 929892c3
  Committed by cyberslack_lee on Mar 30, 2023
```
[CodeStyle][C416][C417] rewrite unnecessary comprehension with function call and use generator instead of map (#52140)

* codestyle c416 c417

* fix error

* fix inc

* unify all C4 rules into one

* fix inc

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
```
  929892c3
- [Test Mv] remove remaining tests in unittests/mlu(#52291) · b6ae6a5d
  Committed by jjyaoao on Mar 30, 2023
  b6ae6a5d
- Del old dygraph varbase (#52236) · d4571470
  Committed by wanghuancoder on Mar 30, 2023
```
* delete old dygraph op test
```
  d4571470
- Add Gloo SendRecv Function (#52221) · b8850521
  Committed by yuehuayingxueluo on Mar 30, 2023
```
* add gloo  send_recv

* fix code_stype

* fix CI bug

* fix send_recv.cc

* add send_recv without sync_op

* fix send_recv test

* fix gather.cc
```
  b8850521

29 Mar, 2023 · 6 commits
- fix QAT export bug (#52218) · a523f6b3
  Committed by Guanghua Yu on Mar 29, 2023
  a523f6b3

- [AMP OP&Test] pad3d add unittests of fp16 and bf16 (#51015) · f86d0be7
  Committed by zengshao0622 on Mar 29, 2023

* pad3d add unittests of fp16 and bf16

* pad3d add unittests of fp16 and bf16

* fix cuda place

* fix random to uniform

* fix class name

* fix fp16 max relative error to 1.5e-3

* add dytpe register for onednn

* add pad uint16 check of common.py

* remove check_eager

* test_check_grad --> test_check_grad_normal

f86d0be7

- [Test Mv] custom_runtime (#52021) · 7f86c1dc
  Committed by Zheng-Bicheng on Mar 29, 2023
  7f86c1dc
- Clear the infrt-related code (#52273) · da5a2584
  Committed by jjyaoao on Mar 29, 2023
```
* Clear the infrt-related code

* remove tools/infrt
```
da5a2584
- Fix flashattn build error on jetson (#51665) · fb5910f4
  Committed by chalsliu on Mar 29, 2023
```
* Fix flashattn build error on jetson

* Fix nvcc not found on jetson
```
fb5910f4

- Add group_norm composite rule (#51874) · cabf3921
  Committed by Yichen Zhang on Mar 29, 2023

* add group_norm composite rule

* add test for scale_grad and bias_grad

* resolve conflicts

* remove amp in composite_rule.py

* add float16 test

* deal with NHWC format

* keep the composite rule in float16 identical as original kernel

* resolve conflicts

cabf3921

PaddlePaddle / Paddle · synced successfully about 1 year ago
