Commits · fd92d949c48137d13d0c4aa1f0dfcf806ebedc4a · PaddlePaddle / Paddle

16 Aug, 2021 · 5 commits
- Support npu op hard_swish and hard_swish_grad (#34608) · fd92d949
  Committed by zyfncg on Aug 16, 2021
```
* Support NPU OP hard_swish and hard_swish_grad

* Support NPU OP hard_swish and hard_swish_grad

* add the unittest to compare the result between npu and cpu

* format the prompt of exception

* replace Min and Max op by ClipByValue op

* fix the precision problem for fp16

* Using HardtanhGrad to improve performance
```
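For readers unfamiliar with the op, here is a minimal reference sketch of hard_swish and its gradient in plain Python. It assumes the commonly used defaults `threshold=6`, `scale=6`, `offset=3`. Those values, the helper names, and the gradient convention at the two kink points are illustrative assumptions rather than details taken from the commit, and the NPU kernel itself is not shown. Writing the clamp as a single `clip` call loosely mirrors the commit note about replacing Min and Max with ClipByValue.

```python
def clip(value, low, high):
    """Clamp value into [low, high] in one step (equivalent to min(max(...)))."""
    return min(max(value, low), high)


def hard_swish(x, threshold=6.0, scale=6.0, offset=3.0):
    """hard_swish(x) = x * clip(x + offset, 0, threshold) / scale."""
    return x * clip(x + offset, 0.0, threshold) / scale


def hard_swish_grad(x, threshold=6.0, scale=6.0, offset=3.0):
    """Derivative of hard_swish with respect to x.

    Assumed convention at the kinks: the flat branches own the boundary points.
    """
    shifted = x + offset
    if shifted <= 0.0:
        return 0.0                       # clipped to 0, output is constant 0
    if shifted >= threshold:
        return threshold / scale         # clipped to threshold, output is linear in x
    return (2.0 * x + offset) / scale    # d/dx [x * (x + offset) / scale]
```

A scalar reference like this can serve as the expected value when comparing results across devices, in the spirit of the unit test mentioned in the commit message.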
- Add bcast semantics checks at C++ level to BroadcastTensorsOp (#34874) · e84b2e9b
  Committed by Zhanlue Yang on Aug 16, 2021
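The commit title above refers to broadcast semantics checks. As a rough illustration of what such a check verifies, here is a plain-Python sketch of the usual NumPy-style rule: shapes are aligned from the trailing axis, and on each axis every size must either match or be 1. The function name `broadcast_shape` is hypothetical, and the actual C++ checks in BroadcastTensorsOp are not shown in this log, so treat this as an assumption about the rule rather than a description of that code.

```python
def broadcast_shape(*shapes):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    if not shapes:
        raise ValueError("at least one shape is required")
    ndim = max(len(shape) for shape in shapes)
    result = []
    # Walk axes from the last one backwards; shorter shapes are treated
    # as if they were padded with leading 1s.
    for axis in range(1, ndim + 1):
        dim = 1
        for shape in shapes:
            if axis > len(shape):
                continue                  # this shape has no such axis
            size = shape[-axis]
            if size == 1:
                continue                  # size 1 always broadcasts
            if dim != 1 and size != dim:
                raise ValueError(
                    f"shapes {shapes} are not broadcastable on axis -{axis}"
                )
            dim = size
        result.append(dim)
    return tuple(reversed(result))
```

For example, `broadcast_shape((4, 1), (1, 5))` yields `(4, 5)`, while `broadcast_shape((2, 3), (4,))` raises `ValueError`.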
- [NPU] remove npu int64 kernel for increment op (#34909) · 28279f6f
  Committed by Leo Chen on Aug 16, 2021
- Check whl size (#34767) · 34d188bf
  Committed by tianshuo78520a on Aug 16, 2021
- [NPU] add p_norm_op_npu (#34695) · 7316018d
  Committed by ronnywang on Aug 15, 2021
```
* add p_norm_op_npu

* remove p_norm_grad op

* update
```
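As background for the p_norm entry above, this is a small plain-Python sketch of a vector p-norm, (Σ|xᵢ|ᵖ)^(1/p). The function name, the default order, and the special handling of orders 0 and ±infinity are assumptions made for illustration; the log does not say which cases the NPU kernel covers.

```python
import math


def p_norm(values, porder=2.0):
    """Vector p-norm of a non-empty sequence of numbers."""
    magnitudes = [abs(v) for v in values]
    if porder == math.inf:
        return max(magnitudes)            # largest absolute value
    if porder == -math.inf:
        return min(magnitudes)            # smallest absolute value
    if porder == 0:
        # Assumed convention: order 0 counts the non-zero entries.
        return float(sum(1 for m in magnitudes if m != 0))
    return sum(m ** porder for m in magnitudes) ** (1.0 / porder)
```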
13 Aug, 2021 · 9 commits

Committed by Tongxin Bai on Aug 13, 2021

* OP dot: refactor CPU kernels and get better loop performance.

* Minor fix on code format.

* Fixed minor errors.

* Add new API: einsum

* Update the Einsum unit test.

One case failed with matmul_v2, where the dtype is int64:

a = np.arange(2 * 3 * 1).reshape(2, 3, 1)
b = np.arange(1)
paddle.einsum("...i, ...i", a, b)

* Test cases in test_einsum test floating point dtypes only.

As of now Paddle only supports float/double dtypes in matmul, which is
one of the building blocks of this Einsum implementation. We decided not to
test einsum against other dtypes.

* Polish format.

* More formatting.

* Format...

* Einsum: improve test coverage.

* Einsum: bug fixes and more testcases for testing error messages

* Einsum: fix format..

* Einsum: fixed typo and format.

* Einsum: format again...

* Einsum: applied suggested changes.

* Einsum API: improve API documentation.

* Einsum API: apply suggested changes.

* Einsum API: Add dygraph only note.

* Einsum API: Add dygraph only note.

* Einsum API: fixed unittest.

8c8667f0
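To make the int64 case quoted in the message above more concrete, here is a plain-Python sketch of what the subscripts `"...i, ...i"` ask for with those shapes: the last axis of `a` (shape 2×3×1) is contracted against `b` (shape 1), leaving a 2×3 result. The helper name `ellipsis_dot` is hypothetical, and it covers only this 3-D by 1-D case. It is not Paddle's implementation, which the message says is built on matmul.

```python
def ellipsis_dot(a, b):
    """out[j][k] = sum_i a[j][k][i] * b[i] for a 3-D nested list a and 1-D list b."""
    out = []
    for plane in a:
        row_out = []
        for vec in plane:
            if len(vec) != len(b):
                raise ValueError("last-axis lengths differ")
            row_out.append(sum(x * y for x, y in zip(vec, b)))
        out.append(row_out)
    return out


# Same integer values as np.arange(2 * 3 * 1).reshape(2, 3, 1) and np.arange(1).
a = [[[0], [1], [2]], [[3], [4], [5]]]
b = [0]
result = ellipsis_dot(a, b)
```

Because `b` holds the single value 0, every entry of `result` is 0, so the case exercises dtype handling rather than the arithmetic.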

fix a bug of slice by none index (#34877) · ff4bdac3
Committed by zyfncg on Aug 13, 2021

Bug fix : Can't load multiple modules of custom c++ op (#34505) · fc6b4a50

Committed by zyfncg on Aug 13, 2021

* Fix a bug : can't load more than one custom op module

* Fix a bug : can't load more than one custom op module

* add test for load multiple modules of custom c++ op

* add config for Coverage CI

fix generator thread safety bug (#34888) · f421741c
Committed by Zeng Jinle on Aug 13, 2021

Support sccache distributed storage on windows (#34879) · 8bc4d854
Committed by zhouweiwei2014 on Aug 13, 2021

[NPU] fix bce_loss_npu, test=develop (#34876) · 5b86b999
Committed by Qi Li on Aug 13, 2021

fix npu_finalize (#34857) · 17a99760
Committed by ronnywang on Aug 13, 2021

add retry for gethostbyname (#34855) · e92f0388
Committed by Baibaifan on Aug 13, 2021
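The gethostbyname entry above describes a retry around hostname resolution. The commit's own code is not shown in this log, so the following is only a Python sketch of the general pattern. The function name `resolve_with_retry`, its parameters, and the fixed delay between attempts are all assumptions; `socket.gethostbyname` is the standard-library call being wrapped.

```python
import socket
import time


def resolve_with_retry(hostname, retries=3, delay=1.0,
                       resolver=socket.gethostbyname):
    """Resolve hostname, retrying on OSError up to `retries` attempts in total."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return resolver(hostname)
        except OSError:                   # socket.gaierror is a subclass of OSError
            if attempt == retries:
                raise                     # out of attempts: surface the last error
            time.sleep(delay)             # wait before the next attempt
```

The `resolver` parameter exists only so the retry logic can be exercised without network access, for example by passing a stub that fails a fixed number of times.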

[npu]add unsqueeze2_grad,test=develop (#34733) · 2164ad61
Committed by andyjpaddle on Aug 13, 2021

12 Aug, 2021 · 11 commits
- [NPU] add meshgrid, test=develop (#34576) · 3f71e8d2
  Committed by Qi Li on Aug 12, 2021
- Remove incorrect signal error stack trace (#34842) · 572adccd
  Committed by Chen Weihang on Aug 12, 2021
```
* remove unmatched signal error stack

* fix error writing for cond
```
- Revert "[oneDNN] Fix to issue #34554 (#34623)" (#34838) · dc62a227
  Committed by Chen Weihang on Aug 12, 2021
```
This reverts commit 0a5c99e8.
```
- fix set_grad_ivar bug of Tensor.backward (#34819) · dffb0b22
  Committed by zhouweiwei2014 on Aug 12, 2021
- [Inference] Inference python api support fp16 (#34676) · 6326c3ef
  Committed by Wilber on Aug 12, 2021
- transformer c files (#34706) · 016cc56d
  Committed by Feng Xing on Aug 12, 2021
```
This PR adds fused-transformer-related files that define the C interface, including classes, functions, etc.
```
- Fix safety-bug of functional.linear (#34696) · 0e28c8bb
  Committed by zhulei on Aug 12, 2021
```
* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear
```
- [HybridParallel]Add Recompute for PipeLineParallel (#34607) · 589d13c5
  Committed by ShenLiang on Aug 12, 2021
```
* add recompute for pp

* add recompute offload

* add recompute partition
```
- [NPU] Support npu kernel for smooth_l1_loss op (#34674) · cfa69133
  Committed by wuhuachaocoding on Aug 12, 2021
- [NPU] Support npu op expand_v2 and expand_v2_grad (#34764) · bc543e35
  Committed by Fan Zhang on Aug 12, 2021
```
* [NPU] Support npu op expand_v2 and expand_v2_grad

* [NPU] Support npu op expand_v2 and expand_v2_grad

* [NPU] Support npu op expand_v2 and expand_v2_grad

* update test_expand_v2_op_npu.py

* update test_expand_v2_op_npu.py

* modify expand_v2_op_npu.cc

* modify expand_v2_op_npu.cc
```
- add det_mv3_db & LeViT test case in pr-ci-inference (#34803) · 1c31d9d3
  Committed by Peihan on Aug 12, 2021
```
* add det_mv3_db & LeViT test case in pr-ci-inference

* fix LeViT model dir bugs

* fix grammar error
```
11 Aug, 2021 · 15 commits

[oneDNN] Fix to issue #34554 (#34623) · 0a5c99e8

Committed by Jacek Czaja on Aug 11, 2021

* - Added softmax without caching

* - Binary is no longer manually cached

* - Activation onednn caching removed

* - Removed manual caching of activation

* - modified UT

* - fix

* - fix

* - fixes to building

* - fix

* - fix

* - fix to UT

* - Faulty UT workaround

* - approval workaround

* - Fixes after review

* - compilation fixes

* - more lint fixes

* - more fixes after review

* - fixes after another round of review

`set_value_grad` propagate gradients to `Input` and `TensorValue` (#34304) · 9d02313c

Committed by WeiXin on Aug 11, 2021

* add set_value_grad op

* add unittest.

* polish unittest.

* polish code.

* support cuda kernel

* polish code according to CI

* polish code.

* polish code

* remove *.pyc

* polish code.

* add unittest to improve coverage.

* polish code.

[Paddle TRT]fix_fc_int8_convert; fix_reshape_convert (#34787) · 3429c04b
Committed by Wangzheee on Aug 11, 2021
```
* fix_fc_reshape_convert

* fix
```
[NPU] Support npu op flatten_contiguous_range_grad (#34798) · fc537d4f
Committed by Fan Zhang on Aug 11, 2021

[NPU] add while, read_from_array and write_to_array npu op (#34755) · 234c21ac
Committed by pangyoki on Aug 11, 2021
```
* add while read_from_array write_to_array npu op

* optimize unittest
```
split_op for npu (#34699) · d45d3112
Committed by Roc on Aug 11, 2021

[NPU] add momentum_op_npu and test (#34082) · 9e3e08f0
Committed by ronnywang on Aug 11, 2021
```
* add momentum_op_npu and test

* update

* fix hang
```
[NPU] add reduce_mean_op_npu and test (#34053) · f6fab559
Committed by ronnywang on Aug 11, 2021
```
* add reduce_mean_op_npu and test

* remove skip.If

* update
```
[NPU] add batch_norm_op_npu and test (#34056) · 9ed5db28
Committed by ronnywang on Aug 11, 2021
```
* add batch_norm_op_npu and tests

* remove skip.If

* fix bug
```

Add ext_tensor.slice() API (#34227) · 3f011d82

Committed by Hao Lin on Aug 11, 2021

* Add ext_tensor.slice() API, test=develop

* Call Tensor::mutable_data first to fix bugs and add test for writing to sliced tensor

* Fix unit test bug

* Fix code format problem, test=develop

* Fix code format problem

* Fix code format problem

* strengthen unit test

* Use CustomTensorUtils::ShareDataFrom to simplify codes

add the basic apis for auto_parallel (#33804) · 3f962e77
Committed by lilong12 on Aug 11, 2021
```
* add auto_parallel apis
```

[NPU] Add exp and exp_grad npu op (#34612) · b5ec65e1

Committed by 0x45f on Aug 11, 2021

* add exp and exp_grad npu op

* modify support register type

* remove empty line and remove exp_grad support data type int/int64

* move exp and exp_grad kernel to activation_op_npu.cc, delete attrs

* move code to activation_op_npu.cc

[NPU] add elementwise_min_grad_op_npu,test=develop (#34731) · 45af4f2a
Committed by andyjpaddle on Aug 11, 2021

miss format (#34771) · addd5fce
Committed by wenbin on Aug 11, 2021

modified reduce_sum_op and reduce_mean_op for higher_performance (#32885) · 6a9fac14
Committed by niuliling123 on Aug 11, 2021

PaddlePaddle / Paddle · synced successfully about 1 year ago