Commits · 7cb4953941230dc109a094c6baefaaff7dda515c · PaddlePaddle / Paddle

28 Apr, 2022 · 14 commits
- Support more scenes for fused_fast_ln (#42282) · 7cb49539
  Committed by Zhang Zheng on Apr 28, 2022
```
* Support more scenes for fused_fast_ln

* fix
```
- fix FusedResidualDropoutBias nan in v100 (#42344) · 687219fe
  Committed by WangXi on Apr 28, 2022
- Bfloat16 refactor (#42238) · 8ad38701
  Committed by Tomasz Socha on Apr 28, 2022
```
* Refactor Quantization

* Refactor Dequantization

* Classy solution

* Style I

* Style II

* Style III

* Use VLOG(4) for debug info

* Style IV
```
- fix error report. (#42333) · afa846d9
  Committed by Wilber on Apr 28, 2022
- Add gradient merge for DistributedFusedLamb optimizer (#40177) · 108aeb28
  Committed by sneaxiy on Apr 28, 2022
```
* add gradient merge for DistributedFusedLamb

* use master acc gradient

* fix CI ut

* polish

* remove math_function_impl.h change

* fix test_update_loss_scaling_op.py

* try to fix XPU/NPU CI

* add gm ut
```
- [CustomDevice] add amp support (#42035) · acbb5dbe
  Committed by ronnywang on Apr 28, 2022
- fix PIL sample mode deprecated warning (#42307) · c7a258fe
  Committed by LielinJiang on Apr 28, 2022
```
* fix PIL sample mode deprecated warning

* compatible with old pil version
```
- [KP] fix bug when phi kernel is *_raw (#42113) · 9fd2c546
  Committed by Liu-xiandong on Apr 28, 2022
```
* [KP] fix bug when phi kernel is *_raw

* modify the static graph

* delete useless comment

* delete the phi multiply kernel case

* add VLOG(3) message

* add VLOG(3) message

* fix static graph error in phi

* fix bug in transform model

* modify the comment

* delete useless code

* fix CI bug

* fix CI bug
```
- [CustomDevice]change import way of unpublished file in op_test test=allcases (#42285) · 62c0304b
  Committed by Aganlengzi on Apr 28, 2022
```
* test op_test test=allcases

* fix

* avoid copy many same file

* fix for win

* test PYTHONPATH

* change path adding way

* fix win

* use old way

* use old way test=allcase

* use old way test=allcase
```
- fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315) · f4507974
  Committed by WangXi on Apr 28, 2022
- [Performance]Add static inline for MakeReturnPyObject (#42334) · 2e1fb26b
  Committed by Aurelius84 on Apr 28, 2022
- polish attr get impl (#42337) · b972b0df
  Committed by Chen Weihang on Apr 28, 2022
- set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (#42320) · 22d3c560
  Committed by FlyingQianMM on Apr 28, 2022
```
* set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast

* fix code style
```
- fix collections.Sequence in python3.10 (#42242) · edb61a52
  Committed by pangyoki on Apr 28, 2022
```
* fix collections.Sequence in python3.10

* fix format
```
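
The `collections.Sequence` fix (#42242), like the `collections.Iterable` fix (#42295), concerns Python 3.10, which removed the abstract-base-class aliases from the top-level `collections` module. They are now available only from `collections.abc`. A minimal sketch of the usual compatible pattern follows. The helper name `is_sequence` is hypothetical and is not taken from the Paddle patch.

```python
import collections.abc


def is_sequence(obj):
    """Return True for list-like containers, excluding str and bytes.

    Hypothetical helper for illustration only. It uses
    collections.abc.Sequence, which works on Python 3.3 and later,
    instead of collections.Sequence, which no longer exists in 3.10.
    """
    if isinstance(obj, (str, bytes)):
        return False
    return isinstance(obj, collections.abc.Sequence)


if __name__ == "__main__":
    print(is_sequence([1, 2, 3]))
    print(is_sequence("abc"))
```

Importing from `collections.abc` needs no version check, which is probably why it is the common replacement.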
27 Apr, 2022 · 22 commits
- Added missing test for shuffle_channel_mkldnn_detect_pass (#42001) · 5134f110
  Committed by jakpiase on Apr 27, 2022
```
* added test for shuffle_channel_mkldnn_detect_pass

* added UT using new framework

* CI fix
```
- implement autotune python API (#42299) · 2094a584
  Committed by Zhang Ting on Apr 27, 2022
- fix gcc warning of [-Wint-in-bool-context] (#42268) · cf780097
  Committed by Leo Chen on Apr 27, 2022
- fix collections.Iterable in python3.10 (#42295) · 3d6fb260
  Committed by pangyoki on Apr 27, 2022
- Fix the race condition in cumsum operator (#42205) · 5d729457
  Committed by Leo Chen on Apr 27, 2022
```
* Fix the race condition in cumsum operator

* Optimize cumsum operator
```
- fix bug (#42314) · 00ed8b57
  Committed by zhaocaibei123 on Apr 27, 2022
- inplace addto (#42313) · 748d2ae0
  Committed by sneaxiy on Apr 27, 2022
- Optimize performance of dygraph (v4) (#42196) · 37e2f027
  Committed by zyfncg on Apr 27, 2022
```
* optimize performance of dygraph

* optimize performance of dygraph and elementwise_add

* optimize the trace op

* fix bug

* fix bug

* fix unittest bug

* fix code format
```
- fix test api problem (#42297) · ed1678aa
  Committed by seemingwang on Apr 27, 2022
```
* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add dsm sample method

* add graph_neighbor_sample_v2

* Add graph_neighbor_sample_v2

* fix for loop

* add cpu sample interface

* fix kernel judgement

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* change index settings

* recover test

* recover test

* fix spelling

* recover

* fix

* move cudamemcpy after cuda stream sync

* fix linking problem

* remove comment

* add cpu test

* test

* add cpu test

* change comment

* combine feature table and graph table

* test

* test

* pybind

* test

* test

* test

* test

* pybind

* pybind

* fix cmake

* pybind

* fix

* fix

* add pybind

* add pybind

* optimize pybind

* test

* fix pybind

* fix

* pybind change

* remove file
Co-authored-by: NDesmonDay <908660116@qq.com>
```
- fix sparse csr (#42271) · b9bfcf14
  Committed by tiancaishaonvjituizi on Apr 27, 2022
- Delete api from __all__ (#42220) · d1e01232
  Committed by Zhang Zheng on Apr 27, 2022
- [CustomDevice] op_test supports custom device (#42227) · 4df02fdf
  Committed by Aganlengzi on Apr 27, 2022
```
* [DO NOT MERGE] test op_test

* update with more related modifications

* split op_test.py to use test=allcases for testing

* split op_test.py to use test=allcases for testing
```
- Unify utils naming style (#42264) · 2cebcf4a
  Committed by Chen Weihang on Apr 27, 2022
```
* unify utils naming style

* polish details
```
- Adjust the relative error of QR's grad (#42221) · 4c80385a
  Committed by Yulong Ao on Apr 27, 2022
- [MLU]add dropout op (#42274) · acca0352
  Committed by qipengh on Apr 27, 2022
- add the support for allreduce_prod for new dygraph (#42284) · 89951472
  Committed by lilong12 on Apr 27, 2022
- Fix paddle setup (#42254) · 8395d660
  Committed by Roc on Apr 27, 2022
```
* expose api

* ref clipgradbynorm

* update

* Update __init__.py
```
- fix multinomial paddle_enforce bug (#42302) · 31c33122
  Committed by zhouweiwei2014 on Apr 27, 2022
- Add move construct for KernelSignature (#42253) · e5a0365b
  Committed by zyfncg on Apr 27, 2022
```
* add move construct for KernelSignature

* add noexcept
```
- fix randperm out of bound bug (#42057) · a6794926
  Committed by zhouweiwei2014 on Apr 27, 2022
- support python3.10 in paddle_build (#42207) · b20683c0
  Committed by pangyoki on Apr 27, 2022
- opt attr eaque perf (#42272) · ca909408
  Committed by Chen Weihang on Apr 27, 2022
26 Apr, 2022 · 4 commits

- support nhwc format for kunlun conv/batch_norm (#42195) · 88d68c08
  Committed by QingshuChen on Apr 26, 2022
```
* support nhwc format for kunlun conv/batch_norm
*test=kunlun

* minor
*test=kunlun
```

- [PaddlePaddle Hackathon 2] No. 29: Add the PixelUnshuffle network-building API to Paddle (#40728) · 5be9b824
  Committed by BrilliantYuKaimin on Apr 26, 2022
  * Add shape inference for PixelUnshuffle
  * Add operator registration for PixelUnshuffle
  * Add kernels for PixelUnshuffle and its gradient
  * Add the PixelUnshuffle operator description
  * Add the PixelUnshuffle operator signature
  * Add PixelUnshuffle at the Python level
  * Add unit tests for PixelUnshuffle
  * Update test_pixel_unshuffle.py
  * test=document_fix
  * Update test_pixel_unshuffle.py
    Add a test for extra_repr
  * Fix code formatting
  * Update test_pixel_unshuffle.py
    Fix the extra_repr test
  * Move the pixel_unshuffle kernel implementation
  * Fix code formatting
  * Improve input checks
  * Update test_pixel_unshuffle.py
  * Improve input checks for pixel_unshuffle
  * Update pixel_unshuffle_op.cc
  * Update unary.cc
  * add pixel_unshuffle
  * Update test_pixel_unshuffle.py
  * Update vision.py
  * Adjust code formatting
  * Update vision.py
  * Delete extra spaces
  * Update pixel_unshuffle_sig.cc
  * Update vision.py
  * Update vision.py
  * add PixelUnshuffleGradInferMeta
  * remove PixelUnshuffleOpArgumentMapping
  * Update pixel_unshuffle_op.cc
  * Relocate the kernels for pixel_unshuffle and its gradient
  * Update pixel_unshuffle_op.cc
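
For readers unfamiliar with the operation this commit adds, here is a rough pure-Python sketch of pixel unshuffle on a single channels-first image. It assumes the conventional definition, in which a downscale factor `r` rearranges a `(C, H, W)` input into `(C*r*r, H/r, W/r)`. The function name `pixel_unshuffle_chw` and the nested-list representation are illustrative assumptions; this is not the Paddle kernel or its Python API.

```python
def pixel_unshuffle_chw(x, r):
    """Rearrange a (C, H, W) nested list into (C*r*r, H//r, W//r).

    Illustrative only: assumes channels-first layout and the usual
    convention out[c*r*r + dy*r + dx][i][j] = x[c][i*r + dy][j*r + dx].
    """
    if r <= 0:
        raise ValueError("downscale factor must be positive")
    channels = len(x)
    height = len(x[0])
    width = len(x[0][0])
    if height % r != 0 or width % r != 0:
        raise ValueError("height and width must be divisible by r")

    out = []
    for c in range(channels):
        for dy in range(r):          # row offset inside each r x r block
            for dx in range(r):      # column offset inside each block
                plane = [
                    [x[c][i * r + dy][j * r + dx] for j in range(width // r)]
                    for i in range(height // r)
                ]
                out.append(plane)
    return out


if __name__ == "__main__":
    image = [[[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11],
              [12, 13, 14, 15]]]
    for plane in pixel_unshuffle_chw(image, 2):
        print(plane)
```

Each output channel collects one fixed position from every `r x r` block, so spatial resolution is traded for channel depth without discarding any values.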

- range can not return shape when enable_static (#42275) · 3cdc7a01
  Committed by ShiningZhang on Apr 26, 2022
- add attr type test (#42263) · eb64983a
  Committed by Chen Weihang on Apr 26, 2022

PaddlePaddle / Paddle · last synced successfully more than 1 year ago
