Commits · 4137c46e7a79e3b23b5d5dda23bba6684aa29fce · PaddlePaddle / Paddle

26 Oct, 2022 · 3 commits
- fix uninitialized, tautological-constant-out-of-range-compare and... · 076c41ef
  Committed by Wang Xin on Oct 26, 2022
```
fix uninitialized, tautological-constant-out-of-range-compare and literal-conversion warning on macOS (#47341)
```
- Fix inference performance problem caused by selecting cudnn kernel of softmax (#47338) · dfe6d8fa
  Committed by zyfncg on Oct 26, 2022
```
* fix inference performance problem caused by selecting cudnn kernel for softmax

* recover use_cudnn in opmaker of softmax
```
- clean useless api tests in phi (#47321) · c334405f
  Committed by Chen Weihang on Oct 25, 2022
25 Oct, 2022 · 4 commits
- Added workaround for elementwise oneDNN kernel (#47080) · 0abf7560
  Committed by jakpiase on Oct 25, 2022
```
* return proper state

* fix for dims

* fix
```
- minor split optimization (#47314) · d5e7d20d
  Committed by jakpiase on Oct 25, 2022
- fix braced-scalar-init warnings on macos (#47309) · d8690564
  Committed by Wang Xin on Oct 25, 2022
- [Zero-Dim] support input 0D Tensor for softmax/log_softmax/gumbel_softmax (#47251) · ac3b882f
  Committed by zhouweiwei2014 on Oct 25, 2022
24 Oct, 2022 · 5 commits
- Polish slice code in fluid (#45746) · 3f64a2c3
  Committed by zyfncg on Oct 24, 2022
```
* support selected_rows kernel for multiply in dygraph

* delete useless code of slice in fluid

* fix compile bug

* move slice_array from fluid to phi

* fix strided_slice_op_npu
```
- [code-gen] Generate static graph code for exp op (#47120) · 5b1dd387
  Committed by zyfncg on Oct 24, 2022
```
* generate static graph code for exp

* refactor the doc of exp

* fix bug

* fix bug

* update doc of exp

* fix sparse op
```
- Enhance the implementation of some conv functions. (#47281) · bc47e7ac
  Committed by Yiqun Liu on Oct 24, 2022
- fix cumsum compilation error for GPU architecture that does not support fast FP16 (#47277) · 84273aaa
  Committed by Zhang Ting on Oct 24, 2022
- Move the header file of conv cudnn and miopen to phi directory. (#47248) · 31f57f29
  Committed by Yiqun Liu on Oct 24, 2022
23 Oct, 2022 · 1 commit
- [CodeStyle][black] use black instead of yapf (#46014) · 7097630f
  Committed by Nyakku Shigure on Oct 23, 2022
```
* update config

* re-blacken python code

* temporarily disable date and diff_py_file

* skip a format
```
21 Oct, 2022 · 2 commits
- fix bug of abs_grad in eager mode for kunlun, test=kunlun (#47164) · a9ac608f
  Committed by zhangyikun02 on Oct 21, 2022
- Fix the bug where the device memory address appears in abs_grad kernel... · 43ad0b17
  Committed by Leo Guo on Oct 21, 2022
```
Fix the bug where the device memory address appears in abs_grad kernel fallback to CPU. test=kunlun (#47186)
```
20 Oct, 2022 · 4 commits
- [Sparse] Fix indices (#47190) · 0e1b6144
  Committed by zhangkaihuo on Oct 20, 2022
```
* fix indices
```
- Add infer prune function (#47046) · af9486fc
  Committed by JingZhuangzhuang on Oct 20, 2022
```
* Add infer prune function

* Update phi.cmake

* Update operators.cmake

* add fusion op
```
- [PaddlePaddle Hackathon 3 No.45 & 46]: Support the float16 data type for Paddle cumsum and logcumsumexp (#45952) · c91b1b91
  Committed by thunder95 on Oct 20, 2022
- fix sparse inplace (#47167) · b9e6b94d
  Committed by zhangkaihuo on Oct 20, 2022
19 Oct, 2022 · 6 commits
- add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431
  Committed by Yuanle Liu on Oct 19, 2022
- Rename name of op and op_args in yaml to align python api (#46343) · 85489d39
  Committed by zyfncg on Oct 19, 2022
```
* rename op in yaml

* fix test_layout_autotune

* fix layout autotune of transpose
```
- remove fluid symbol depend in sync bn (#47122) · ab369976
  Committed by Chen Weihang on Oct 19, 2022
- Enable to record whether the conv algo is got by exhaustive search to fix... · 3bc4b850
  Committed by Yiqun Liu on Oct 19, 2022
```
Enable to record whether the conv algo is got by exhaustive search to fix autotune cache bug. (#47065)
```
- slice op supports uint8_t (#47067) · 1e1c7275
  Committed by will-jl944 on Oct 19, 2022
- [Dy2Static] Remove GradTransformer (#47063) · be3908a3
  Committed by xiongkun on Oct 19, 2022
```
* [Dy2Static] Remove GradTransformer
1. fix einsum infershape bugs.
2. remove grad_transformer and unify paddle.grad and paddle.static.gradient.
3. add dygraph_and_dy2static_only decorator for dy2static.

* fix bugs

* rename
```
18 Oct, 2022 · 5 commits
- [Zero-Dim] support 0D Tensor for reshape/create_parameters (#47074) · 35d5db36
  Committed by zhouweiwei2014 on Oct 18, 2022
- add embedding range check (#46991) · d68c38ef
  Committed by seemingwang on Oct 18, 2022
```
* add embedding range check

* change header file

* change header file

* fix
```
- Add value check & error message for gather_tree (#47051) · e5e3d5cf
  Committed by liu zhengxi on Oct 18, 2022
- [XPU] update xpu cmake to 1016. test=kunlun (#47041) · 55ac9c46
  Committed by houj04 on Oct 18, 2022
```
* [XPU] update xpu cmake to 1016. test=kunlun

* fix special case of transpose op. test=kunlun
```
- [code-gen] Support code-gen for opmaker of sparse op (#46993) · bdd3dde3
  Committed by zyfncg on Oct 18, 2022
```
* support generating code of opmaker for backward op invoke forward op

* support code-gen of opmaker for sparse op

* refine logic of choosing phi kernel

* fix compile bug

* fix code_gen bug

* fix bug

* fix kernel signature code-gen

* fix compile bug of VarType

* fix compile bug of VarType

* fix test_sparse_conv_op

* fix test_sparse_norm_op
```
17 Oct, 2022 · 7 commits
- Support BF16 training for sharding (#46846) · 0b39b244
  Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bug.

* Polish code.

* Polish code.

* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>

- delete maybe unused code in paddle\phi\infermeta\sparse\unary.h (#46844) · 776e80a6
  Committed by OccupyMars2025 on Oct 17, 2022
- [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
- Fix warning message format error (#47045) · 13284437
  Committed by RedContritio on Oct 17, 2022

- [Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694) · abb38136
  Committed by OccupyMars2025 on Oct 17, 2022

* add sparse reshape

* change the dtype in all test cases to int64

* just one test case

* modify comments

* Update test_sparse_reshape_op.py

* change the type of "shape" from vector<int64_t> to IntArray

* check whether sp_out.to_dense() is the cause  of error

* print sp_out

* Update reshape_kernel.cc

* use numpy to generate the equal paddle tensor

* just check dense_tensor.numpy()

* check cpu and cuda versions

* Update test_sparse_reshape_op.py

* supply all test cases for cpu forward coo kernel

* test forward coo cuda kernel

* change configuration of cuda kernel

* keep only one test case

* test coo cpu kernel (forward and backward)

* row major or column major ???

* test cuda coo forward kernel

* complete declaration and registration

* Update __init__.py

* rebuild

* retrigger CI

* add cudaMalloc and cudaMemcpy  in  ReshapeCooKernel  and change back to row major order in a cuda dense tensor

* modify minor error

* test only cpu coo forward kernel

* add all test cases for coo forward kernel  (both cpu and gpu)

* test all forward kernels (coo, csr; cpu, gpu)

* add all test cases for all kinds of kernels

* just retrigger CI

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* resolve conflicts

* Update sparse_ops.yaml

* don't specify tensor place

* new shape has -1 or 0 in it

* Update unary_grad_kernel.h

* correct lvalue error

* code style

* Update sparse_backward.yaml

* Update sparse_ops.yaml

* Update unary_kernel.h

* Update unary.py

* Update sparse_backward.yaml

* Update unary.py

* code style

* code style

* code style

* Update unary.py

* specify tensor place explicitly

* do not use numpy array

* use numpy array in unit test again

* modify example code in docstring


- Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mode. (#47004) · f9c1cdc1
  Committed by Leo Guo on Oct 17, 2022
```
test=kunlun
```
- [Custom Device] Add singleton to custom device (#46963) · 73196e5a
  Committed by duanyanhui on Oct 17, 2022
```
* add singleton to custom device

* Update custom_device.cc

Init device_init_flag_ in default
```

14 Oct, 2022 · 2 commits
- speed_up for deformable conv (#46997) · eee6b3a7
  Committed by Rayman on Oct 14, 2022
- TRT pool2d adaptive mode bugfix (#46802) · eb32746a
  Committed by Wang Bojun on Oct 14, 2022
```
* draft with debug print
```
13 Oct, 2022 · 1 commit
- [Phi] Refactor logic of judging whether having a phi kernel (#46920) · 8d797fd2
  Committed by zyfncg on Oct 13, 2022
```
* refine logic of choosing phi kernel

* fix compile bug
```

PaddlePaddle / Paddle · synced successfully more than 1 year ago
