Commits · 0b39b244f1567f5fb8dc89e888ded57f5daf792c · BaiXuePrincess / Paddle

17 Oct, 2022 · 21 commits

Support BF16 training for sharding (#46846) · 0b39b244

Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bug.

* Polish code.

* Polise code.

* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>

0b39b244
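
The first bullet of this commit mentions wrong `reduce_sum` results once `input.numel()` exceeds `INT32_MAX`. The commit message does not state the root cause; one common way such bugs arise is an element count or flat index held in a 32-bit signed integer. A minimal Python sketch of that wraparound follows, using `ctypes.c_int32` to mimic a C `int`. The helper name `as_int32` is hypothetical and none of this is Paddle code.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(n: int) -> int:
    """Reinterpret n as a 32-bit signed integer, wrapping like a C int.

    Hypothetical helper for illustration only.
    """
    return ctypes.c_int32(n).value


numel = INT32_MAX + 1      # one more element than an int32 can count
wrapped = as_int32(numel)  # the value a 32-bit counter would hold instead

print(numel, wrapped)
```

A loop bound or offset computed from the wrapped value would be negative, which is why kernels that may see very large tensors usually index with 64-bit integers.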

Revert "add common subexpression elimination (#44386)" (#47062) · 7c6835ca
Committed by hong on Oct 17, 2022
```
This reverts commit 166ff39a.
```
7c6835ca

delete maybe unused code in paddle\phi\infermeta\sparse\unary.h (#46844) · 776e80a6
Committed by OccupyMars2025 on Oct 17, 2022

776e80a6

[PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
ec749398

[hidden trouble] Update test_sparse_transpose_op.py to get rid of a hidden trouble. (#47017) · d43c972c
Committed by OccupyMars2025 on Oct 17, 2022
```
* Update test_sparse_transpose_op.py

* Update test_sparse_transpose_op.py
```
d43c972c

Fix warning message format error (#47045) · 13284437
Committed by RedContritio on Oct 17, 2022

13284437

【Hackathon No.8】 add gumbel distribution api (#46255) · f1a9f877

Committed by YuRonan on Oct 17, 2022

* init gumbel api

* commit: update test file

* fix：bug

* update Gumbel API

* upgrade distribution/gumbel.py

* add tests/test_distribution_gumbel.py

* fix：code style

* fix：code style

* fix：code style

* fix：code style

* fix bug

* fix：code style

* fix：code style

* fix：rollback uniform

* fix：delete invalid code

* fix：bug and add static test

* fix：code style

* fix：code style

* fix：delete init transforms

* fix：bug

* fix：bug

* fix：code style

* fix：code style

* fix：add transforms

* fix：code style

* fix：code style

* fix：bug

* fix：bug

* fix：code style

* fix：code style

* fix：bug

* fix：code style

* fix：code style

* fix：bug for gumbel.py / add：judge transforms'len for transformed_distribution.py

* update gumbel.py

* fix：bug for test_distribution_gumbel.py

* fix：bug for test_distribution_gumbel_static.py

* fix：code style

* fix：code style

* fix：coverage

* fix：bug

* fix：bug

* fix：code style

* fix：bug

* delete：no use package for gumbel.py

* add：coverage transforms'len judge for test_distribution_gumbel.py

* fix：code style for test_distribution_gumbel.py

* fix：coverage

* fix：code style

* fix：code style

* fix：code style

* fix：code style

* fix：code style

* fix：en doc

* fix：param

* fix：copyright

* fixSample; test=document_fix

Co-authored-by: dasen <sen15530876201@163.com>

f1a9f877
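
For context on what a Gumbel distribution API computes: a Gumbel(loc, scale) variable has CDF exp(-exp(-(x - loc) / scale)), so samples can be drawn by inverting that CDF, and the mean is loc + scale · γ, where γ is the Euler-Mascheroni constant. The sketch below is plain Python with hypothetical function names (`sample_gumbel`, `gumbel_mean`); it is not the code added by this commit and does not reflect its API.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_gumbel(loc: float, scale: float, rng: random.Random) -> float:
    """Draw one Gumbel(loc, scale) sample by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # log(0) is undefined, so redraw in that rare case
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def gumbel_mean(loc: float, scale: float) -> float:
    """Closed-form mean of Gumbel(loc, scale)."""
    return loc + scale * EULER_GAMMA


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_gumbel(1.0, 2.0, rng) for _ in range(200_000)]
empirical_mean = sum(samples) / len(samples)

print(empirical_mean, gumbel_mean(1.0, 2.0))
```

The empirical mean should land close to the closed-form value, which is a quick sanity check of the kind a distribution unit test might perform.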

[Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694) · abb38136

Committed by OccupyMars2025 on Oct 17, 2022

* add sparse reshape

* change the dtype in all test cases to int64

* just one test case

* modify comments

* Update test_sparse_reshape_op.py

* chang the type of "shape"  from  vector<int64_t>  to  IntArray

* check whether sp_out.to_dense() is the cause  of error

* print sp_out

* Update reshape_kernel.cc

* use numpy to generate the equal paddle tensor

* just check dense_tensor.numpy()

* check cpu and cuda versions

* Update test_sparse_reshape_op.py

* supply all test cases for cpu forward coo kernel

* test forward coo cuda kernel

* change configuration of cuda kernel

* keep only one test case

* test coo cpu kernel (forward and backward)

* row major or column major ???

* test cuda coo forward kernel

* complete declaration and registration

* Update __init__.py

* rebuild

* retrigger CI

* add cudaMalloc and cudaMemcpy  in  ReshapeCooKernel  and change back to row major order in a cuda dense tensor

* midify minor error

* test only cpu coo forward kernel

* add all test cases for coo forward kernel  (both cpu and gpu)

* test all forward kernels (coo, csr; cpu, gpu)

* add all test cases for all kinds of kernels

* just retrigger CI

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* resolve conflicts

* Update sparse_ops.yaml

* don't specify tensor place

* new shape has -1 or 0 in it

* Update unary_grad_kernel.h

* correct lvalue error

* code style

* Update sparse_backward.yaml

* Update sparse_ops.yaml

* Update unary_kernel.h

* Update unary.py

* Update sparse_backward.yaml

* Update unary.py

* code style

* code style

* code style

* Update unary.py

* specify tensor place explicitly

* do not use numpy array

* use numpy array in unit test again

* modify example code in docstring

abb38136

support __floordiv__ (#47060) · 64307903
Committed by Weilong Wu on Oct 17, 2022

64307903
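
As background for this commit title: in Python, the `//` operator dispatches to `__floordiv__` on the left operand and falls back to `__rfloordiv__` on the right operand. The sketch below shows that protocol on a hypothetical `Scalar` wrapper class; it is an illustration of the language mechanism, not the change made in Paddle.

```python
class Scalar:
    """Hypothetical numeric wrapper used only to illustrate the protocol."""

    def __init__(self, value):
        self.value = value

    def __floordiv__(self, other):
        # Handles `Scalar // x`
        other_value = other.value if isinstance(other, Scalar) else other
        return Scalar(self.value // other_value)

    def __rfloordiv__(self, other):
        # Handles `x // Scalar` when x does not know about Scalar
        return Scalar(other // self.value)


q = Scalar(7) // 2     # calls Scalar.__floordiv__
r = 7 // Scalar(2)     # int.__floordiv__ returns NotImplemented, so __rfloordiv__ runs
neg = Scalar(-7) // 2  # floor division rounds toward negative infinity

print(q.value, r.value, neg.value)
```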

Layernorm shift partition enhance (#46816) · 9e08633c

Committed by Wang Bojun on Oct 17, 2022

* first version of ln_s_p with s>0

* refine and UT

* pass opt draft

* pass opt

* code refine

* code-style

* bug fix

* fix ci test

* code style

9e08633c

[Auto Parallel] Fix the bug of completion (#47056) · f0af2708
Committed by Yulong Ao on Oct 17, 2022
```
* [Auto Parallel] Fix the bug for None labels

* [Auto Parallel] Fix the completion bug
```
f0af2708

fix for conv_bias_mkldnn_pass (#47037) · acbda3e4
Committed by jakpiase on Oct 17, 2022

acbda3e4

skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr (#46911) · 2e7dc666

Committed by pangyoki on Oct 17, 2022

* skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr

* update ut

* test_dist_allreduce_op failed

* fix test_dist_allreduce_op

* add ut

* fix nccl cpu compile

* fix

2e7dc666

[CodeStyle][py2] remove `compat` module (to_bytes) (#47035) · 198c7993

Committed by Nyakku Shigure on Oct 17, 2022

* [CodeStyle][py2] remove `compat` module (to_bytes)

* remove some unused imports

* clean up to_bytes definition and unittests

* Revert "clean up to_bytes definition and unittests"

This reverts commit e726539e1768172a411ff60e63fab82f164343cf.

* use `b` prefix instead of `encode()`

198c7993
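
The last bullet of this commit, "use `b` prefix instead of `encode()`", refers to two ways of obtaining a `bytes` object in Python 3. A small sketch of the equivalence for an ASCII string (the value `"paddle"` is just an example, not taken from the changed files):

```python
# str -> bytes at run time; encode() defaults to UTF-8
encoded = "paddle".encode()

# bytes literal, written directly in the source
literal = b"paddle"

print(encoded == literal, type(literal).__name__)
```

Bytes literals may only contain ASCII characters, so the `b` prefix is a drop-in replacement only for ASCII text; other strings still need an explicit `encode()`.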

Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mode. (#47004) · f9c1cdc1
Committed by Leo Guo on Oct 17, 2022
```
test=kunlun
```
f9c1cdc1

fix dygraph new format problem export in QAT (#47023) · 6566b8f5
Committed by Guanghua Yu on Oct 17, 2022

6566b8f5

fix unittest test_post_training_quantization_lstm_model problem (#47024) · f4ea771d
Committed by Guanghua Yu on Oct 17, 2022

f4ea771d

add shape info into eager log (#46934) · 74938395
Committed by Jiabin Yang on Oct 17, 2022

74938395

fix typo error in operator.cc (#46995) · 328236d2
Committed by HongyuJia on Oct 17, 2022

328236d2

[Eager] use CastPyArg2Double to parse python float obj (#47029) · b4a1f43f
Committed by Weilong Wu on Oct 17, 2022

b4a1f43f

[Custom Device] Add singleton to custom device (#46963) · 73196e5a
Committed by duanyanhui on Oct 17, 2022
```
* add singleton to custom device

* Update custom_device.cc

Init device_init_flag_ in default
```
73196e5a

16 Oct, 2022 · 1 commit
- add common subexpression elimination (#44386) · 166ff39a
  Committed by ZeKai Zhou on Oct 16, 2022
  
  166ff39a
15 Oct, 2022 · 1 commit
- delete GetExpectedKernelType mkldnn of transpose2 (#46977) · 64b61fc4
  Committed by HongyuJia on Oct 15, 2022
  
  64b61fc4
14 Oct, 2022 · 13 commits
- Simplify conv_mkldnn op registration (#46907) · eded6013
  Committed by Chen Weihang on Oct 14, 2022
```
* simplify conv_mkldnn op registration

* remove custom type value in conv grad op
```
  eded6013
- Fix collective APIs cannot be recognized when building docs (#46962) · 2010bdc3
  Committed by Wen Sun on Oct 14, 2022
  
  2010bdc3
- Update imread() read image way error (#47005) · a1b99978
  Committed by jingsongliu on Oct 14, 2022
```
* update test_image.py

* update test_image.py
```
  a1b99978
- [AutoParallel] adapt for gpt-gen (#46771) · 31a437b1
  Committed by zhaoyingli on Oct 14, 2022
```
* for gpt-gen

* fix reshard

* adapt assign and shape op

* add dist_assign & unittest

* add conditional block unittest

* rename unittest
```
  31a437b1
- speed_up for deformable conv (#46997) · eee6b3a7
  Committed by Rayman on Oct 14, 2022
  
  eee6b3a7
- Fix hAPI bug of not compatible with LayerHook (#47001) · a6a2618e
  Committed by parap1uie-s on Oct 14, 2022
```
* Fix hAPI bug of not compatible with LayerHook
```
  a6a2618e
- TRT pool2d adaptive mode bugfix (#46802) · eb32746a
  Committed by Wang Bojun on Oct 14, 2022
```
* draft with debug print
```
  eb32746a
- remove BackendType in inference api. (#46942) · eb429936
  Committed by Wilber on Oct 14, 2022
  
  eb429936
- [inference][trt] fix reshape2 opteller and elementwise min/max trt registration (#46861) · 2f9de5f3
  Committed by Zhang Jun on Oct 14, 2022
  
  2f9de5f3
- [Auto Parallel] Fix the bug for None labels (#46987) · 974e98bc
  Committed by Yulong Ao on Oct 14, 2022
  
  974e98bc
- Add more record event in run program op (#46949) · 48bb2c0a
  Committed by WangZhen on Oct 14, 2022
```
* Add more record event in run program op

* Refine code

* Restore code

* Rename event
```
  48bb2c0a
- Update distributed_strategy.proto (#46531) · fcdc6777
  Committed by Shijie on Oct 14, 2022
  
  fcdc6777
- add athor (#46994) · cba0020b
  Committed by Leo Chen on Oct 14, 2022
  
  cba0020b
13 Oct, 2022 · 4 commits

[geometric] Add unittest for send_uv (#46948) · f6ae9fb9
Committed by Siming Dai on Oct 13, 2022

f6ae9fb9

[Phi] Refactor logic of judging whether having a phi kernrel (#46920) · 8d797fd2
Committed by zyfncg on Oct 13, 2022
```
* refind logic of choose phi kernrel

* fix complie budg
```
8d797fd2

Fix quantize model deploy bugs when using MKLDNN (#45920) · 561fd8c8

Committed by yeliang2258 on Oct 13, 2022

* fix immutable op quantize bugs

* fix

* fix build bug

* fix test

* notest,test=inference

* fix ppyoloe acc drop bugs

* fix test

* fix test

* add test

* fix

* fix

* fix test

* fix refined name bug

* fix test

* bias fix

* fix matmul weight dequant bug

* re-ci

* fix tester

* fix test

* fix tester

* update weight dequantize func

* update code

* update test for converage

* update test

* update cmake

* update cmakelist

* update code

* rerun ci

* remove useless code

561fd8c8

logsumexp support fp16 (#45817) · 910e1b6a
Committed by xiaohemaikoo on Oct 13, 2022

910e1b6a

BaiXuePrincess / Paddle · Up to date with the fork's source project