Commits · 0f9ec0133ac23d91fb7f2010d703dfa685ff82b1 · PaddlePaddle / Paddle

Mar 30, 2023 · 40 commits
- [Bug-fix] fix bug of Tensor.item() when CUDAPinnedPlace (#52322) · 0f9ec013
  Committed by zhouweiwei2014 on Mar 30, 2023
- [AMP OP&Test] Transpose OP fp16 unittest (#52315) · f1cdd654
  Committed by Wang Xinyu on Mar 30, 2023
```
* transpose fp16 test

* transpose auto tune fp16 test
```
- [Sparse] Fix the bug of elementwise_grad (#52102) · aeb8c2e2
  Committed by zhangkaihuo on Mar 30, 2023
- [XPU] add delete_cast_op_pass (#52305) · 8b622d58
  Committed by zhupengyang on Mar 30, 2023
- [Move Test] Move prim (#52167) · 3e2d0195
  Committed by Zheng-Bicheng on Mar 30, 2023
```
* update

* update
```
- mv paddle/fluid/platform/device/xpu/tests 2 test/xpu/cpp (#52243) · bc5bae16
  Committed by Kim on Mar 30, 2023
```
* mv paddle/fluid/platform/device/xpu/tests 2 test/xpu/cpp

* add missing cmake
```
- [Move Test] Move prim cpp (#52173) · a445466f
  Committed by Zheng-Bicheng on Mar 30, 2023
```
* update

* update

* update
```
- support complex data types for libpaddle.Tensor's element get and set (#52324) · 13b12457
  Committed by Feiyu Chan on Mar 30, 2023
```
1. add type caster for paddle's complex type, to allow pybind to automatically cast it with python's complex type;
2. add complex64 and complex128 data type for `libpaddle.Tensor`'s element get and set (which is required to perturb an element to get the numerical derivative)
3. add support for cuda pinned place in `libpaddle.Tensor` element get and set

---
4. fix a bug in op code generation (creation of the output folder concurrently with parsing op yamls).
```
- [AMP OP&Test] add fp16 test for linspace (#52161) · 40b30f50
  Committed by Roc on Mar 30, 2023
- [AMP] Add python API for collecting operator stats. (#52215) · 73544322
  Committed by Yiqun Liu on Mar 30, 2023
```
* [AMP] Add python API for collecting operator stats.

* Fix import and polish codes.

* Add more unittest.

* Add doc for the new APIs.
```
- add autogen code support for spectral_norm (#52145) · 28927209
  Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for spectral_norm

* bug fixed

* fix PR-CI-Static-Check fail
```
- Speedup worker (#51760) · 8ca86d72
  Committed by pangengzheng on Mar 30, 2023
```
* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor initialization

* async infer shape and reuse memory
```
- adjust binding order (#52225) · 16ec22c4
  Committed by Yuanle Liu on Mar 30, 2023
- [AMP OP&Test] Modify the FP16 and BF16 OpTest of Add_N (#52311) · e3217e3e
  Committed by Vvsmile on Mar 30, 2023
```
* adjust default tolerance of output and grad

* fix a bug in the grad of OpTest

* fix the type of setting default value in optest, both forward and
backward

* add default

* fix test_sum_op

* fix test_sum_op test for testing add_n

* modify the add_n op_test
```
- add scatter composite rule. (#52005) · e16eb22c
  Committed by zxcd on Mar 30, 2023
```
* add scatter composite rule.

* add public_python_api

* add python uint16 support.

* fix code style.

* add cinn to makelist

* cinn does not support uint16; forbid cinn when dtype==uint16.
```
- add xpu cumprod, group norm grad (#52089) · fb16bdc7
  Committed by ykkk2333 on Mar 30, 2023
- register fluid kernels to phi [part 1] (#52014) · 93d01787
  Committed by huangjiyi on Mar 30, 2023
```
* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: chenruibiao <chenruibiao@baidu.com>
```
- [XPU] add delete_concat_op_pass (#52304) · 70ebef81
  Committed by zhupengyang on Mar 30, 2023
- Fix bug of c_softmax_with_cross_entropy_op_xpu_op (#52296) · 8ef97088
  Committed by Ghost Screaming on Mar 30, 2023
```
* Support ignore_index for c_softmax_with_cross_entropy_op.

* Polish code. Remove useless comments and add Testcase.

* Polish code for TestCase.

* Polish code.

* Polish code style.

* Polish code.

* Change loss calculation formula and ignore_index dtype.

* Polish TestCase.

* Fix bug of c_softmax_with_cross_entropy_op_xpu_op. Attribute 'ignore_index'
dtype is int64_t.
```
- [AMP&OP_TEST] Fix interp test case (#52282) · dfa893fd
  Committed by 傅剑寒 on Mar 30, 2023
```
* delete check_dygraph and use default atol,max_relative_error

* add test case for bicubic_interp
```
- [AMP OP&Test] Register FP16 for multinomial. (#52107) · 7788b65e
  Committed by yunyaoXYY on Mar 30, 2023
```
* add FP16 for multinomial

* fix input data

* update code

* fix FP16

* fix code
```
- rename Scalar related utility functions (use CamelCase) (#52280) · e5a0dc31
  Committed by Feiyu Chan on Mar 30, 2023
- [Perf] remove sync_calc_stream and sync_comm_stream (#51989) · 0f4229c5
  Committed by kangguangli on Mar 30, 2023
```
* remove sync_calc_stream and sync_comm_stream

* fix ci bug

* fix

* fix

* fix
```
- support auto generate for prelu (#51913) · d1c7b386
  Committed by Ainavo on Mar 30, 2023
```
* support auto generate for prelu

* add input parameters to op_compat

* del attrs ; add kernel data_type

* add PreluGradInferMeta
```
- [AMP] use promote dtype when amp_level=O2 (#51063) · 6f8ab1fa
  Committed by Zhang Ting on Mar 30, 2023
- [AMP OP&Test] Strided slice fp16 and bf16 unittest (#52220) · 5cdd9f2c
  Committed by Wang Xinyu on Mar 30, 2023
```
* stride slice fp16 and bf16 unittest

* fix code style

* add self.dtype
```
- [Test Mv] ipu_test (#52143) · 38a477e2
  Committed by gouzil on Mar 30, 2023
```
* [Test Mv] ipu_test

* [Test Mv] cmake add py_test_modules

* [Move Test] rm py_test_modules

* rm asp
```
- fix gcc12 error (#52318) · 77b7765f
  Committed by risemeup1 on Mar 30, 2023
- add autogen code support for sigmoid_cross_entropy_with_logits (#52263) · 710c13ed
  Committed by gouzil on Mar 30, 2023
```
* add autogen code support for sigmoid_cross_entropy_with_logits

* add inplace
```
- add autogen code support for merge_selected_rows (#52274) · 6cd3575c
  Committed by Wang Xin on Mar 30, 2023
```
* add autogen code support for merge_selected_rows

* bug fixed
```
- force sync batch norm grad sequential (#52268) · 336160cf
  Committed by wanghuancoder on Mar 30, 2023
```
* force sync batch norm grad sequential
```
- [Test Mv] remove infrt (#52270) · 551ff882
  Committed by jjyaoao on Mar 30, 2023
- Skip device transfer when arg-defs is set to Allbackend (#52294) · 54497c47
  Committed by Ruibiao Chen on Mar 30, 2023
- [AMP OP&Test] assign op add fp16, bfp16 test (#52233) · 41f0e3c3
  Committed by zhenhailiu on Mar 30, 2023
```
* add fp16 bfp16 test

* polish

* polish

* polish
```
- [AMP OP&Test] Arg min max bf16 test (#52276) · 3161e6c3
  Committed by zhenhailiu on Mar 30, 2023
```
* polish

* add type check
```
- [AMP OP&Test] element_wise_add_fp16_test (#52240) · bed54a70
  Committed by zhenhailiu on Mar 30, 2023
- [BugFix] Fix segment fault in order setting (#52293) · d2cdc7e3
  Committed by ShenLiang on Mar 29, 2023
```
* fix bug in proto

* add utest
```
- fix the compare in PD_MEA_CHECK_OVERFLOW (#52300) · 155018ee
  Committed by Danyang Zhang on Mar 30, 2023
- Change some op with xpu control (#52067) · 1faa06f0
  Committed by lzydev on Mar 30, 2023
```
* change op with xpu

* change range yaml

* fix bug in generate_op.py
```
- support python object input data broadcast for model parallel (#51765) · 8baf33a4
  Committed by Guoxia Wang on Mar 30, 2023
```
* support python object input data broadcast for model parallel

* add unittest

* fix

* fix concat 0D tensor

* fix codestyle
```

PaddlePaddle / Paddle · synced successfully more than 1 year ago