Commits · ceee71a0ef63cee73e91d920c5739145bf4bf735 · Crayon鑫 / Paddle

27 Aug, 2021 · 8 commits

Add unpool2d op & Expose max_unpool2d API (#35056) · ceee71a0

Committed by xiaoting on Aug 27, 2021

* add maxunppol2d op, test=develop

* fix typo, test=develop

* fix unpool unitest, test=develop

* fix unpool code-example, test=develop

* fix for unpool_op_unittest,test=develop

* fix example code, test=develop

* add noqa:F401, test=develop

* fix converage, test=develop

* fix unitest for unpool, test=develop

* rename unpool2d to unpool, test=develop

* rename unpool2d to unpool, test=develop

ceee71a0

sparse_momentum_op is used to save w@GRAD memory for gather_op (#34942) · 234ce932
Committed by Guoxia Wang on Aug 27, 2021
```
* sparse_momentum_op is used to save w@GRAD memory for gather_op when gather from a large parameter
```
234ce932

Add fusion_gru and multi_gru to PTQ (Post-Training Quantization) (#33749) · 7debae3a

Committed by joanna.wozna.intel on Aug 27, 2021

* Add calculation for gru op

* Correct the types

* Remove mkldnn only

* Correct mkldnn ifdef

* Remove mkldnn ifdef

* Separate mkldnn quantizer test

* Correct Windows test

* Check different cmake fix

* Revert cmake change

* Cmake change 2

* Cmake change 3

7debae3a

Polish DeviceEvent interface and Remove #ifdef in InterpreterCore (#35196) · 48bf7cbf
Committed by Aurelius84 on Aug 27, 2021
```
* add CPUDeiveEvent

* Polish DeviceEvent code

* Add DEVICE_EVENT_LIBS
```
48bf7cbf

gelu/logsigmoid add AsExtra (#35198) · 2006fbc4
Committed by zhupengyang on Aug 27, 2021

2006fbc4

add elementwise max grad op for npu (#34862) · 5310ceab

Committed by baoachun on Aug 27, 2021

* add elementwise max grad op for npu

* add elementwise max grad op for npu

* add elementwise max grad op for npu

* add elementwise max grad op for npu

* add elementwise max grad op for npu

5310ceab

Polish the error message of paddle.slice. (#35179) · 669853f5
Committed by WeiXin on Aug 27, 2021
```
* polish the error message of paddle.slice.

* polish code.
```
669853f5

Revert "Add copy from tensor (#34406)" (#35173) · 32c1ec42
Committed by zhangchunle on Aug 27, 2021
```
This reverts commit ac33c0ca.
```
32c1ec42

26 Aug, 2021 · 16 commits

[oneDNN] disable caching oneDNN primitives in matmul v2, Reduce grad and... · 31f0221f

Committed by Jacek Czaja on Aug 26, 2021

[oneDNN] disable caching oneDNN primitives in  matmul v2, Reduce grad and elementwise_add grad, expand_v2 (#35132)

* - grad caching disabled of matmul_v1

- compilation fix

- compilation fix

* - reduction removed

* - Matmul v2 disabled caching

* Draft of further changes

* - workaround for reducegrad

* - fixes to UT

* - fix to compilation

* - another fix

* - fix

31f0221f

Add paddle.utils.dlpack APIs (#35067) · 8dc050d8
Committed by Siming Dai on Aug 26, 2021
```
* add dlpack api and fix a from_dlpack 
```
8dc050d8

fix assign bug support fp16 uint8 (#35153) · 270efb96

Committed by duanboqiang on Aug 26, 2021

* fix assign bug support fp16 uint8

* fix dygragh assign bool bug

* modify code style

* revoke bool modification

270efb96

gc for newexecutor (#35085) · f1472039

Committed by wanghuancoder on Aug 26, 2021

* gc for newexecutor, test=develop

* refine, test=develop

* add interpretercore_gc_helper.h,test=develop

* backup

* gc whit thread and device_event, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* fix bug, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* add CheckGC, test=develop

f1472039

Support dropout backward in eval mode (#35122) · f1275fb6
Committed by smallv0221 on Aug 26, 2021
```
* Support dropout backward in eval mode

* add downscale case

* minor fix

* minor fix
```
f1275fb6

support tensor index. (#34824) · e7df47ec
Committed by WeiXin on Aug 26, 2021
```
* polish code

* polish code.

* polish code.

* polish code.

* polish code.
```
e7df47ec

Support Multi-Stream, Single-Thread in New Executor (#35024) · 678a259a

Committed by Aurelius84 on Aug 26, 2021

* Modify into QueueSync QueueAsync

* fix complie on MacOS

* fix pointer

* fix conflict

* polish unittest

* fix windows fetch error

* polish code according reviewer

* fix device_guard on CPU place

678a259a

Add feed_forward for fused attention op. (#34945) · d1a33bc7

Committed by Li Min on Aug 26, 2021

Describe

Add feed_forward for fused attention op.
(1) Encapsulate matmul impl (forward and backward) used in attention op.
(2) Implement bias_add (forward and backward) used in attention op.

d1a33bc7

[NPU] Support npu kernel for StridedSlice op without grad (#34601) · fa6c59a4
Committed by Bo Liu on Aug 26, 2021

fa6c59a4

Add copy from tensor (#34406) · ac33c0ca

Committed by Shang Zhizhou on Aug 26, 2021

* add api

* temp save

* revert

* copytocpu async ok

* fix style

* copy sync ok

* fix compile error

* fix compile error

* api done

* update python async api

* fix compile

* remove async python api; add c++ async unittest

* remove python async api

* update unittest

* update unittest

* add C++ unittest for copytensor

* add unittest

* update namespace utils to class TensorUtils

* add unittest

* update unittest

* update unittest

* update code style

* update code style

* update unittest

ac33c0ca

Add roi align op npu (#34973) · 289e1818
Committed by shiyutang on Aug 26, 2021
```
* add_roi_align_npu

* update

* update

* update
```
289e1818

[Inference] Replace unordered_map with map to support subgraph stability (#35147) · a1aae040
Committed by Wilber on Aug 26, 2021

a1aae040

add temporary MultiThreadedWorkQueue (#35158) · e4a8815d
Committed by liutiexing on Aug 26, 2021

e4a8815d

fix cast op (#35156) · 412877e6
Committed by duanboqiang on Aug 26, 2021

412877e6

fix the bug of channel-wise quantization for ernie (#34948) · c71025eb
Committed by XGZhang on Aug 26, 2021

c71025eb

use spinlock in auto growth (#35139) · 0efda9d9
Committed by wanghuancoder on Aug 26, 2021
```
* use spinlock in auto growth, test=develop

* refine,test=develop
```
0efda9d9

25 Aug, 2021 · 11 commits

disable test_resnet50_quant (#35149) · b3ef9a68
Committed by Peihan on Aug 25, 2021

b3ef9a68

fix cmaklist for new executor (#35137) · 03cb3132
Committed by wanghuancoder on Aug 25, 2021
```
* fix cmaklist for new executor, test=develop

* refine, test=develop

* refine, test=develop
```
03cb3132

Modify ci time count & fix resnet50_quant multi_thread tests (#35141) · 9e54209d
Committed by Peihan on Aug 25, 2021
```
* Modify ci time count & fix resnet50_quant multi_thread tests

* fix wrong time variable
```
9e54209d

Fix for expand_v2 op (#35101) · 1f34f7ec
Committed by jakpiase on Aug 25, 2021
```
* temporary change

* fix for expand_v2

* changes after review, activated ppyolov inference test
```
1f34f7ec

fix cpu adamw problem for np.float64 (#35124) · 700205e8
Committed by zhaoyingli on Aug 25, 2021

700205e8

[NPU] Fix the performance problem when 'axis' is not specified (#35116) · 91ba86b1
Committed by ronnywang on Aug 25, 2021

91ba86b1

fix potential tensor leak in tensor.__setitem__ (#35013) · 763b6d91
Committed by Leo Chen on Aug 25, 2021
```
* fix index tensor leak in __setitem__

* fix another usage of PyTuple_Pack

* refine code

* refine code

* handle None index

* add Py_DecRef

* revert ut

* refine code

* merge develop

* use RAII

* follow comments
```
763b6d91

[hybrid performance] optim npu coalesce set constant (#35105) · 4bfd0445
Committed by Yuang Liu on Aug 25, 2021

4bfd0445

[NPU] add npu_one_hot_v2 (#34937) · d710c3a0
Committed by ronnywang on Aug 25, 2021

d710c3a0

high-performance SingleThreadedWorkQueue (#35086) · 751a7942
Committed by liutiexing on Aug 25, 2021

751a7942

update elementwise api in kunlun (#35021) · ff96a7d5
Committed by taixiurong on Aug 25, 2021

ff96a7d5

24 Aug, 2021 · 5 commits

Add flags to control whether to check Nan value of hccl_allreduce_sum. (#35093) · 5b737834
Committed by gongweibao on Aug 24, 2021

5b737834

add fetch, test=develop (#35019) · a5060b55

Committed by wanghuancoder on Aug 24, 2021

* add fetch, test=develop

* fix fetch2op, test=develop

* fix fetch2op, test=develop

* refine, test=develop

* fix fetch ctx, test=develop

* add wait, test=develop

* rename fetch2 to fetch_v2, test=develop

* merge, test=develop

a5060b55

Add no_sync in data parallel for dynamic graph (#34740) · b09f4d7f

Committed by Haohongxiang on Aug 24, 2021

* Add no_sync in data parallel for dynamic graph

* modify UT of no_sync

* delete test_parallel_dygraph_dataparallel_no_sync.py

* add test_parallel_dygraph_no_sync.py

* modify run_trainer_with_spawn in UTs

* Add UT of complex control flow in no_sync

* add specific descriptions and notes for no_sync

* check code style

* modify UT's TIMEOUT in CMakeLists.txt

b09f4d7f

[NPU] fix NPU ci scripts, test=develop (#35095) · a332352a
Committed by Qi Li on Aug 24, 2021

a332352a

fix bmm bug (#35098) · de645153
Committed by duanboqiang on Aug 24, 2021
```
* fix bmm bug

* bmm style

* fix bmm
```
de645153

Crayon鑫 / Paddle is in sync with its fork source project