Commits · bc47e7ac2c63e2ba12984f71a82f2d312eb6ef18 · PaddlePaddle / Paddle

24 Oct, 2022 3 commits
- Enhance the implementation of some conv functions. (#47281) · bc47e7ac
  Committed by Yiqun Liu on Oct 24, 2022
  
  bc47e7ac
- fix cumsum compilation error for GPU architecture that does not support fast FP16 (#47277) · 84273aaa
  Committed by Zhang Ting on Oct 24, 2022
  
  84273aaa
- Move the header file of conv cudnn and miopen to phi directory. (#47248) · 31f57f29
  Committed by Yiqun Liu on Oct 24, 2022
  
  31f57f29
23 Oct, 2022 1 commit
- [CodeStyle][black] use black instead of yapf (#46014) · 7097630f
  Committed by Nyakku Shigure on Oct 23, 2022
```
* update config

* re-blacken python code

* temporarily disable date and diff_py_file

* skip a format
```
  7097630f
21 Oct, 2022 2 commits
- fix bug of abs_grad in eager mode for kunlun, test=kunlun (#47164) · a9ac608f
  Committed by zhangyikun02 on Oct 21, 2022
  
  a9ac608f
- Fix the bug where the device memory address appears in abs_grad kernel... · 43ad0b17
  Committed by Leo Guo on Oct 21, 2022
```
Fix the bug where the device memory address appears in abs_grad kernel fallback to CPU. test=kunlun (#47186)
```
  43ad0b17
20 Oct, 2022 4 commits
- [Sparse] Fix indices (#47190) · 0e1b6144
  Committed by zhangkaihuo on Oct 20, 2022
```
* fix indices
```
  0e1b6144
- Add infer prune function (#47046) · af9486fc
  Committed by JingZhuangzhuang on Oct 20, 2022
```
* Add infer prune function

* Update phi.cmake

* Update operators.cmake

* add fusion op
```
  af9486fc
- [PaddlePaddle Hackathon 3 No.45 & 46] Support float16 data type for Paddle cumsum and logcumsumexp (#45952) · c91b1b91
  Committed by thunder95 on Oct 20, 2022
  
  c91b1b91
- fix sparse inplace (#47167) · b9e6b94d
  Committed by zhangkaihuo on Oct 20, 2022
  
  b9e6b94d
19 Oct, 2022 6 commits
- add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431
  Committed by Yuanle Liu on Oct 19, 2022
  
  de6e7431
- Rename name of op and op_args in yaml to align python api (#46343) · 85489d39
  Committed by zyfncg on Oct 19, 2022
```
* rename op in yaml

* fix test_layout_autotune

* fix layout autotune of transpose
```
  85489d39
- remove fluid symbol depend in sync bn (#47122) · ab369976
  Committed by Chen Weihang on Oct 19, 2022
  
  ab369976
- Enable to record whether the conv algo is got by exhaustive search to fix... · 3bc4b850
  Committed by Yiqun Liu on Oct 19, 2022
```
Enable to record whether the conv algo is got by exhaustive search to fix autotune cache bug. (#47065)
```
  3bc4b850
- slice op supports uint8_t (#47067) · 1e1c7275
  Committed by will-jl944 on Oct 19, 2022
  
  1e1c7275
- [Dy2Static] Remove GradTransformer (#47063) · be3908a3
  Committed by xiongkun on Oct 19, 2022
```
* [Dy2Static] Remove GradTransformer
1. fix einsum infershape bugs.
2. remove grad_transformer and unify paddle.grad and paddle.static.gradient.
3. add dygraph_and_dy2static_only decorator for dy2static.

* fix bugs

* rename
```
  be3908a3
18 Oct, 2022 5 commits
- [Zero-Dim] support 0D Tensor for reshape/create_parameters (#47074) · 35d5db36
  Committed by zhouweiwei2014 on Oct 18, 2022
  
  35d5db36
- add embedding range check (#46991) · d68c38ef
  Committed by seemingwang on Oct 18, 2022
```
* add embedding range check

* change head file

* change head file

* fix
```
  d68c38ef
- Add value check & error message for gather_tree (#47051) · e5e3d5cf
  Committed by liu zhengxi on Oct 18, 2022
  
  e5e3d5cf
- [XPU] update xpu cmake to 1016. test=kunlun (#47041) · 55ac9c46
  Committed by houj04 on Oct 18, 2022
```
* [XPU] update xpu cmake to 1016. test=kunlun

* fix special case of transpose op. test=kunlun
```
  55ac9c46
- [code-gen] Support code-gen for opmaker of sparse op (#46993) · bdd3dde3
  Committed by zyfncg on Oct 18, 2022
```
* support generating code of opmaker for backward op invoke forward op

* support code-gen of opmaker for sparse op

* refine logic of choosing phi kernel

* fix compile bug

* fix code_gen bug

* fix bug

* fix kernel signature code-gen

* fix compile bug of VarType

* fix compile bug of VarType

* fix test_sparse_conv_op

* fix test_sparse_norm_op
```
  bdd3dde3
17 Oct, 2022 7 commits

- Support BF16 training for sharding (#46846) · 0b39b244
  Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bug.

* Polish code.

* Polish code.

* Add bfloat16 datatype in fill_grad kernels.
Co-authored-by: Nsneaxiy <sneaxiy@126.com>

0b39b244

- delete maybe unused code in paddle\phi\infermeta\sparse\unary.h (#46844) · 776e80a6
  Committed by OccupyMars2025 on Oct 17, 2022

776e80a6
- [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
ec749398
- Fix warning message format error (#47045) · 13284437
  Committed by RedContritio on Oct 17, 2022

13284437

- [Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694) · abb38136
  Committed by OccupyMars2025 on Oct 17, 2022

* add sparse reshape

* change the dtype in all test cases to int64

* just one test case

* modify comments

* Update test_sparse_reshape_op.py

* change the type of "shape" from vector<int64_t> to IntArray

* check whether sp_out.to_dense() is the cause  of error

* print sp_out

* Update reshape_kernel.cc

* use numpy to generate the equal paddle tensor

* just check dense_tensor.numpy()

* check cpu and cuda versions

* Update test_sparse_reshape_op.py

* supply all test cases for cpu forward coo kernel

* test forward coo cuda kernel

* change configuration of cuda kernel

* keep only one test case

* test coo cpu kernel (forward and backward)

* row major or column major ???

* test cuda coo forward kernel

* complete declaration and registration

* Update __init__.py

* rebuild

* retrigger CI

* add cudaMalloc and cudaMemcpy  in  ReshapeCooKernel  and change back to row major order in a cuda dense tensor

* modify minor error

* test only cpu coo forward kernel

* add all test cases for coo forward kernel  (both cpu and gpu)

* test all forward kernels (coo, csr; cpu, gpu)

* add all test cases for all kinds of kernels

* just retrigger CI

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* resolve conflicts

* Update sparse_ops.yaml

* don't specify tensor place

* new shape has -1 or 0 in it

* Update unary_grad_kernel.h

* correct lvalue error

* code style

* Update sparse_backward.yaml

* Update sparse_ops.yaml

* Update unary_kernel.h

* Update unary.py

* Update sparse_backward.yaml

* Update unary.py

* code style

* code style

* code style

* Update unary.py

* specify tensor place explicitly

* do not use numpy array

* use numpy array in unit test again

* modify example code in docstring

abb38136

- Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mode. (#47004) · f9c1cdc1
  Committed by Leo Guo on Oct 17, 2022
```
test=kunlun
```
f9c1cdc1
- [Custom Device] Add singleton to custom device (#46963) · 73196e5a
  Committed by duanyanhui on Oct 17, 2022
```
* add singleton to custom device

* Update custom_device.cc

Init device_init_flag_ in default
```
73196e5a

14 Oct, 2022 2 commits
- speed_up for deformable conv (#46997) · eee6b3a7
  Committed by Rayman on Oct 14, 2022
  
  eee6b3a7
- TRT pool2d adaptive mode bugfix (#46802) · eb32746a
  Committed by Wang Bojun on Oct 14, 2022
```
* draft with debug print
```
  eb32746a
13 Oct, 2022 7 commits

- [Phi] Refactor logic of judging whether having a phi kernrel (#46920) · 8d797fd2
  Committed by zyfncg on Oct 13, 2022
```
* refine logic of choosing phi kernel

* fix compile bug
```
8d797fd2
- logsumexp support fp16 (#45817) · 910e1b6a
  Committed by xiaohemaikoo on Oct 13, 2022

910e1b6a
- [Zero-Dim] support 0D for paddle.transpose/reshape/stack/tile/unsqueeze (#46555) · 78add057
  Committed by zhouweiwei2014 on Oct 13, 2022

78add057
- fix softmax memory align (#46902) · 71748805
  Committed by carryyu on Oct 13, 2022

71748805

- Revert #46111 (#46961) · cf9ca61d
  Committed by Zhang Ting on Oct 13, 2022

* Revert "[Hackathon No.56&38] Implement float16 data type support and forward-pass acceleration for the deformable_conv_v1 operator (#46111)"

cf9ca61d

- Correct the logic and remove unnecessary template param (#46623) · 450af30c
  Committed by Zhang Zheng on Oct 13, 2022
```
* Correct the logic and remove unnecessary template param

* fix error throw

* fix print format

* fix ci
```
450af30c

- [Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759
  Committed by HongyuJia on Oct 13, 2022

* remove PADDLE_WITH_MKLDNN, test white_list=abs

* fix unique_ptr

* fix op.Type()

* remove TODO in kernel_dispatch.h

* remove IndicateVarDataType function, update white_list

* remove mkldnn hard code

* add comments

* fix ==

* update mkldnn_op_list

* delete hard code of OPs

* update mkldnn_op_list

* update mkldnn_op_list, remove interp

* add error check for ExecutionContext

* update mkldnn_op_list, remove transpose2_grad

* remove interpolate mkldnn

* remove fill_constant mkldnn

* opt HasAttr in DygraphExecutionContext

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_black_list

* update mkldnn_op_list, add assert error op

* solve cudnn related op

* fix error

* add mkldnn fallback in phi_utils.cc

* remove mkldnn fallback in phi_utils.cc

* opt code implementation

* polish Copyright License

ef1c8759

12 Oct, 2022 3 commits
- Revert "remove comment (#46827)" (#46935) · 2ea3700a
  Committed by Zhang Ting on Oct 12, 2022
```
This reverts commit 8a5f17e8.
```
  2ea3700a
- deliver indices_dict (#46919) · 4681f13b
  Committed by zhangkaihuo on Oct 12, 2022
  
  4681f13b
- support generating code of opmaker for backward op invoke forward op (#46912) · 227ab74d
  Committed by zyfncg on Oct 12, 2022
  
  227ab74d

PaddlePaddle / Paddle synced successfully about 1 year ago
