Commits · 97a9552698f6cc565bad2c29577e324c56e5b713 · Crayon鑫 / Paddle

07 May, 2021 1 commit
- Refactor `dot` op's CPU kernel for better performance (#32589) · 97a95526
  Committed by Tongxin Bai on May 07, 2021
```
* OP dot: refactor CPU kernels and get better loop performance.

* Minor fix on code format.

* Fixed minor errors.
```
06 May, 2021 5 commits
- change parameter name from softmax_switch to use_softmax, test=develop · 28d42a94
  Committed by chajchaj on May 06, 2021
- [ROCM] bugfix for unittest (#32392) · 31392627
  Committed by ronnywang on May 06, 2021
```
* fix test_unpool_op

* fix test_inplace_addto_strategy

* fix test_conv2d_fusion_op

* fix test_imperative_lod_tensor_to_selected_rows, test_imperative_selected_rows_to_lod_tensor

* fix test_dot_op

* fix test_correlation_op

* fix tracer

* fix test_memcpy_op
```
- Fix bugs of pipeline on ascend. (#32737) · c5ae21f4
  Committed by gongweibao on May 06, 2021
- add int64 support test=develop (#32736) · f1c68a08
  Committed by gongweibao on May 06, 2021
```
add int64 support
```
- Sum kernel for CPU supporting BF16 and SelectedRows (#32631) · 9599c3b3
  Committed by Adam Osewski on May 06, 2021
30 Apr, 2021 5 commits

pylayer_op:release context after compute. (#32707) · 3cc11a3d
Committed by WeiXin on Apr 30, 2021

Add 12 inplace APIs including auto generated (#32573) · 308073de
Committed by pangyoki on Apr 30, 2021

* add relu6_ hardsigmoid_ leaky_relu_ Inplace APIs

* add softmax_with_cross_entropy_ Inplace API

* add clip_ scale_ add_ subtract_ Inplace APIs

* add wlist

* fix parameter of scale api

* add add_n_ Inplace API and remove log_ Inplace API

* fix elementwise_add_ and elementwise_sub_ broadcast problem

* elementwise inplace api give error message before run the op

* use broadcast_shape in elementwise inplace op

* add 8 inplace apis that is auto generated

* add unittest for all inplace apis

* add decorator for inplace apis in static mode

* fix windows blas fail of exp inplace api, change array_equal to allclose

* add flatten inplace api

* add flatten unittest

* fix flatten unittest

* add decorator

* fix grad.numpy in test_pylayer_op

* unsupport softmax_with_cross_entropy_

* add test_inplace_softmax_with_cross_entropy to static_mode_white_list

* delete __all__ in inplace_utils

* delete activation inplace function and add Tensor.inplace_func

* change paddle.inplace_ to Tensor.inplace_

* fix little problem

* add paddle in inplace_utils

remove is_test=True in grad (#32678) · bd8d35a2
Committed by ceci3 on Apr 30, 2021

add_c_sync_npu_kernel (#32687) · 8fd724a5
Committed by Baibaifan on Apr 30, 2021

Reduce grad fix (#32592) · 43527a2b
Committed by jakpiase on Apr 30, 2021

29 Apr, 2021 4 commits
- [NPU] refine FillNpuTensorWithConstant (#32682) · 0f578db9
  Committed by Leo Chen on Apr 29, 2021
- Add op read_file and decode_jpeg (#32564) · b22f6d69
  Committed by LielinJiang on Apr 29, 2021
```
* add op read_file and decode_jpeg
```
- forward return any type. (#32661) · b6ca6a55
  Committed by WeiXin on Apr 29, 2021
- Add BF16 uniform random initializer (#32468) · f46f15a0
  Committed by joanna.wozna.intel on Apr 29, 2021
```
* Add bf16 uniform random initializer

* Remove duplicated section

* Change UT to CPU place only

* Put detail functions into anonymous namespace
```
28 Apr, 2021 5 commits

[NPU] add input EpsilonTensor for adam (#32605) · 119cda3d
Committed by Leo Chen on Apr 28, 2021

* add input EpsilonTensor for adam

* update python api

* add unit test

* add npu test

* add more ut

Added pure_bf16 mode (#32281) · bc379ca3
Committed by arlesniak on Apr 28, 2021

Fix some error message (#32614) · 9ee709fc
Committed by Kqnonrime on Apr 28, 2021

* fix two error message

* fix two error message

* fix error

* fix error

* fix error

* fix error

* fix some error message

* fix some error

* fix error

* fix some error

* fix some error

* fix some error

* fix one error

* fix some error

* fix seven error message

* fix error

* fix error

* fix error

* fix error

* fix some error message

* fix error

* fix some error

* fix some error

[oneDNN] Added clearing oneDNN cache per executor (#32499) · ba610761
Committed by Jacek Czaja on Apr 28, 2021
```
* - Added clearing oneDNN per executor

* - Executor is nt always having FLAGS_use_mkldnn set to true
```
Optimize update_loss_scaling_op (#32554) · 0dc02dc7
Committed by jiangcheng on Apr 28, 2021

* optimize update_loss_scaling_op by fused for loop to one kernel, test=develop

* remove useless while loop and optimize variable name, test=develop

* optimize variable name from out_addrs_tensor to out_addrs_mem, test=develop

* optimize variable name for readable by change prefix identifier from t_ to local_

27 Apr, 2021 5 commits
- add alltoall api (#32507) · db41b742
  Committed by lilong12 on Apr 27, 2021
```
* add alltoall api, test=develop
```
- [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) · 1afe1ac9
  Committed by Zhong Hui on Apr 27, 2021
```
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
```
- Unify the implementation of activation operation (#32348) · eca8dcc7
  Committed by Zhang Zheng on Apr 27, 2021
- slove develop bugs (#32560) · 6f6e159a
  Committed by Baibaifan on Apr 27, 2021
- Fix grad calculation bug in tensor_array_to_tensor (#32558) · 6579432f
  Committed by Aurelius84 on Apr 27, 2021
26 Apr, 2021 5 commits

Optimize where_index_op(prefix sum) (#30601) · 6ec4e640
Committed by jiangcheng on Apr 26, 2021

* new optimize for where_index_op with prefix sum version.

* write a scan prefix sum kernel with stream for where index op.

* optimize where_index by using cub::DeviceScan::InclusiveSum instead of imperfect self-kernel.

* remove CheckTrue struct and rename stide_array for readable.

* optimize variable name for readable.

* optimize function name and annotation.

[HybridParallel] fix port reuse when create multi group (#31876) · 41bfec8d
Committed by WangXi on Apr 26, 2021

[HybridParallel]Fix model parallel bug by using C++ op (#32536) · ea465fa5
Committed by ShenLiang on Apr 26, 2021
```
* fix model parallel

* rm parallel_help.py

* add embedding
```
support backward return None, when corresponding input tensor without gradient (#32494) · 8e66046b
Committed by WeiXin on Apr 26, 2021
```
* support backward return None.

* edit unittest.

* edit code according to CI

* Improve error information
```
optimize slice op and slice grad op (#32266) · 5161f71a
Committed by jiangcheng on Apr 26, 2021

* optimize slice op and slice grad op, test=develop

* optimize variable name and annotation information, test=develop

25 Apr, 2021 9 commits
- [Setitem] Support grad computation of op set_value (#32431) · 25e723e7
  Committed by liym27 on Apr 25, 2021
- add copy_cross_scope (#32432) · 5943ff7b
  Committed by Baibaifan on Apr 25, 2021
- fix gradient(nan) when two inputs are equal (#32448) · 1896c777
  Committed by Zhang Ting on Apr 25, 2021
- [ROCM] update PADDLE_WITH_ROCM to PADDLE_WITH_HIP, test=develop (#32487) · 3b4dcad7
  Committed by Qi Li on Apr 25, 2021
- add silu op, test=develop (#32384) · 2f351ed5
  Committed by minghaoBD on Apr 25, 2021
- [BUG FIX] when x.dim < y.dim, the result of compare_op is inverse (#32470) · 78eff521
  Committed by wawltor on Apr 25, 2021
```
* fix bug: when x.dim < y.dim, the result of compare_op is inverse to expected result

* support the cuda for fix the compare broadcast bug
```
- fix reader_blocking_queue_test (#32505) · 4db2cc90
  Committed by Chen Weihang on Apr 25, 2021
- [NPU] refine lookup_table_v2_grad npu_kernel (#32497) · fb7590d4
  Committed by Leo Chen on Apr 25, 2021
```
* use ZerosLike instead of NPUMemsetAsync

* fix compile
```
- Nne integration (#32255) · feb2e476
  Committed by denglin-github on Apr 25, 2021
```
* Add dlnne engine runtime

* Fix log

* Remove <const_cast> and remove unrelated modify with dlnne, +clang-format

* Fix CMakeList format error

* Add copyright message

* Fix dlnne CMakeList.txt

* Add some paddlepaddle_pass to support more networks

* Fix some format bug
```
23 Apr, 2021 1 commit
- add the c_identity op (#32485) · 8fa8a37f
  Committed by lilong12 on Apr 23, 2021
```
* add c_identity op, test=develop
```