Commits · 0b6dd5355e5327912c2476e167277618079d9fce · PaddlePaddle / Paddle

26 Apr, 2023 13 commits
- [Zero-Dim] distributed scatter/all_to_all support input 0D tensor (#53186) · 0b6dd535
  Committed by zhouweiwei2014 on Apr 26, 2023
- 【prim】scatter_nd_add_grad (#52469) · 55c4eb8a
  Committed by mhy-666 on Apr 26, 2023
```
* add scatter_nd_add comp

* add scatter_nd_add prim

* fix

* fix

* add public_python_api in TestScatterNdAddSimpleOp setup function

* fix composite_backward_api.h

* fix composite_backward

* add test cases

* fix composite_backward_api.h, unittest
```
- Fix fused_attention_op and fused_feedforward_op bugs in xpu (#53318) · 1164626c
  Committed by Ruibiao Chen on Apr 26, 2023
```
* Fix fused_attention_op and fused_feedforward_op bugs in xpu

* Fix d_x alloc errors for fused_feedforward_grad_kernel
```
- remove some [-Wunused-parameter] waring (#53319) · f9e5072b
  Committed by Galaxy1458 on Apr 26, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```
- Optimize c_embedding op in deterministic mode (#53197) · 35f5c245
  Committed by sneaxiy on Apr 26, 2023
```
* optimize embedding deterministic mode

* fix compile error

* change FLAGS_cudnn_deterministic to int64

* fix 700 error

* add ut

* fix ut

* fix ut

* fix win32 ci

* fix flags with PHI_DEFINE_EXPORTED_int64
```
- [Debug][Werror]error: control reaches end of non-void function [-Werror=return-type](#53326) · 23e96bde
  Committed by engineer1109 on Apr 26, 2023
- remove *npu.cc (#53342) · b305629c
  Committed by 陈沧夜 on Apr 26, 2023
- 【Hackathon No.48】Add float16 data type support for the Paddle determinant operator (#53286) · 2a705b74
  Committed by denglianbin on Apr 26, 2023
- 【Hackathon No.48】Add float16 data type support for the Paddle meshgrid operator (#53284) · 9127cc3c
  Committed by denglianbin on Apr 26, 2023
- [Bug Fixs] fix bugs when using cast<int64_t, int32_t> in xpu/cross_entropy... · 1d549400
  Committed by Lucas on Apr 26, 2023
```
[Bug Fixs] fix bugs when using cast<int64_t, int32_t> in xpu/cross_entropy kernels, *test=kunlun (#53325)
```
- Optimize prompt information (#53291) · 3ec12c2b
  Committed by risemeup1 on Apr 26, 2023
```
* Optimize prompt information

* add_information

* add_information
```
- add autogen code support for box_coder op (#53309) · ed040a17
  Committed by Wang Xin on Apr 26, 2023
- Register fluid xpu kerenls to phi [part 3] (#53189) · 37489df5
  Committed by huangjiyi on Apr 26, 2023
```
* update

* update
```
25 Apr, 2023 17 commits
- Add singlely compile gpu kernel camke function (#53305) · af986bd5
  Committed by lzydev on Apr 25, 2023
```
* support register single .cu file

* add register GPU kernel function
```
- update tile_grad composite rule (#53261) · dda6b9d5
  Committed by ccrrong on Apr 25, 2023
- [XPU][BUG] Fix link_xpu_op_max_pass bug (#53258) · be1b3fc3
  Committed by sprouteer on Apr 25, 2023
- add mp_sync config. (#53254) · 503f422e
  Committed by wuhuachaocoding on Apr 25, 2023
- [Paddle Inference] add generic plugin for p_norm (#53278) · 00f747f2
  Committed by Yuanle Liu on Apr 25, 2023
- Register fluid xpu kerenls to phi [part 1] (#53187) · f6f48780
  Committed by huangjiyi on Apr 25, 2023
```
* update

* fix bug

* Revert "affine_channel_op"
```
- 【PaddlePaddle Hackathon 4 No.33】Optimize the GPU compute performance of the Histogram op for Paddle (#53112) · c1a61fc0
  Committed by Zero Rains on Apr 25, 2023
```
* create KernelMinMax to optimize the performance of histogram op in GPU

* change to block and warp wise operation

* remove the time in DtoH

* fix a bug
```
- [PHI]Add flags macro for PHI (#52991) · 22e96bde
  Committed by YuanRisheng on Apr 25, 2023
```
* add flags for phi

* fix compile bugs

* fix ci bugs

* fix inference bugs

* fix cinn' bugs

* fix cinn bugs

* perfect code according comment

* fix ci bugs

* fix ci bugs
```
- [DEBUG] print modifed flags (#53243) · 8d4b64e8
  Committed by Chitsing KUI on Apr 25, 2023
```
* print modifed flags

* fix ref, opt print

* fix default getter

* fix ut
```
- 【Hackathon No.61】Improve FP16/BF16 unit tests for the min operator (#52887) · d7a5e900
  Committed by cyberslack_lee on Apr 25, 2023
- fix shared memory over usage in embedding grad kernel on deterministic mode (#53247) · 6f684bd2
  Committed by shaojie_wang on Apr 25, 2023
```
* fix shared memory over usage in embedding grad kernel on determistic mode

* use IdT as interger dtype
```
- test,test=develop (#53301) · bddeecd1
  Committed by Galaxy1458 on Apr 25, 2023
- tile op support 0D input for xpu (#53237) · 336bc20b
  Committed by zhangyikun02 on Apr 25, 2023
- [Paddle-TRT] The Graph uses OpConverterType for op converter (#53214) · c7c5635e
  Committed by zhoutianzi666 on Apr 25, 2023
```
* add ```converter_type``` for op converter
```
- 【Hackathon No57】add fp16 & bf16 for max_pool2d_with_index, max_pool3d_with_index (#52314) · 46951224
  Committed by Difer on Apr 25, 2023
```
* add fp_bf for pool_max_withidx

* fix some error

* fix error

* codestyle error

* fix masktype

* fix input bf type

* input bf dtype convert error

* back to convert input to bf16 first

* fix convert error

* fix bf16 grad check
```
- add syncthreads (#53149) · b7565222
  Committed by Bo Zhang on Apr 25, 2023
- Remove managed memory msg in cuda allocator (#53263) · 37a09539
  Committed by Ruibiao Chen on Apr 25, 2023
24 Apr, 2023 10 commits
- [Zero-Dim] Support paddle.max output 0D, test=allcase (#53242) · 9f9cd919
  Committed by zhouweiwei2014 on Apr 24, 2023
- fix dist_grad kernel (#53239) · ddd72039
  Committed by Leo Chen on Apr 24, 2023
- fix 'Werror-maybe-uninitialized' compiler error in GCC 11.3 (#53246) · 21508090
  Committed by Wang Xin on Apr 24, 2023
- Add "enable_tensor_checker" and "disable_tensor_checker" to api list (#52936) · 41138718
  Committed by niuliling123 on Apr 24, 2023
- [Zero-Dim] support 0d tensor for shape and squeeze onednn kernel (#52832) · c0a604e7
  Committed by YangQun on Apr 24, 2023
```
* support 0d tensor for shape and squeeze onednn kernel

* set python api for shape op ut
```
- Fix the calculation of layer_norm_bwd (#53224) · a0aff194
  Committed by Zhang Zheng on Apr 24, 2023
```
* Fix the calculation of layer_norm_bwd

* fix
```
- transform cachekv datalayout of fused_multi_transformer_xpu (#53144) · bfa5d6b8
  Committed by zhupengyang on Apr 24, 2023
- fix compile bug of kps (#53251) · ae426b78
  Committed by zyfncg on Apr 24, 2023
- fix static_assert with no message (#53222) · 71474b10
  Committed by Yuanle Liu on Apr 24, 2023
- Mv eager ut (#53167) · adc2b745
  Committed by tianshuo78520a on Apr 24, 2023
```
* Mv eager tests

* fix

* fix build error

* fix build error

* fix codestyle
```

PaddlePaddle / Paddle synced successfully about 1 year ago