Commits · e19195f795db4d5c7de816abe025c3d16d978ba9 · BaiXuePrincess / Paddle

09 Mar, 2021 · 1 commit

Support npu kernel for gather op (#31458) · e19195f7

Committed by xiayanming on Mar 09, 2021

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

e19195f7

08 Mar, 2021 · 2 commits

[NPU] add npu kernel for communication op (#31437) · 15823bb0

Committed by lw921014 on Mar 08, 2021

* add allreduce and broadcast without test

* add c_broadcast_test case

* build c_comm_init and c_create_group operators

* make the whole thing compile

* add broadcast and init op test case but run failed

* make unit test compile

* fix broadcast test bug and change into hcom for ccl

* change c_comm_init and c_create_group ops accordingly

* make tests compile

* transfer code to 27

* compiled successfully in 28, but run failed

* test broadcast in 28, but failed

* make hcom primitives work

* change hccl data type for base.h

* fix broadcast bug

* make attributes work

* fix group name bug

* add allreduce but test failed

* allreduce bug for qiuliang

* allreduce finished

* add allgather and reducescatter

* merge all op code

* add allgather test

* finish run all ccl op test exclude send/recv

* all all op and test exclude send/recv

* send_v2_npu.cc recv_v2_npu.cc compiled

* fix ccl core dump bug and test allgather, reducescatter, broadcast op

* fix allreduce bug just for test

* hcom send&recv test pass, without hcom_destroy

* for qiuliang test

* Ascend Send&Recv Test Pass

* all op (ex send/recv) ok

* fix bug

* merge all ccl op

* style merge to PaddlePaddle

* merge style

* new merge style

* merge style 2

* insert an empty at the end

* disable ctest for hcom to pass ci

Co-authored-by: void-main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>

15823bb0

[NPU] squeeze and unsqueeze op for ascend (#31452) · 388c69f2

Committed by Reventon_L on Mar 08, 2021

Co-authored-by: root <xiayanming@baidu.com>

388c69f2

05 Mar, 2021 · 1 commit

Fix pow, refine code (#31440) · 83f81eb5

Committed by Leo Chen on Mar 05, 2021

83f81eb5

04 Mar, 2021 · 4 commits

Fix pow, use fillD instead of broadcast (#31433) · 5fe3d596

Committed by Leo Chen on Mar 04, 2021

5fe3d596

fix endif (#31431) · ecc6e213

Committed by zhang wenhui on Mar 04, 2021

ecc6e213

[NPU] Support npu kernel for shape op (#31427) · b3c88e96

Committed by zhang wenhui on Mar 04, 2021

* add shape npu

* fix

* fix

b3c88e96

[NPU] add npu kernel for equal op (#31393) · ac3d821b

Committed by Leo Chen on Mar 04, 2021

* add npu kernel for equal op

* refine code

* add more ut

* update year

ac3d821b

02 Mar, 2021 · 3 commits

[NPU] Support npu op layer_norm and layer_norm_grad (#31310) · 0310945f

Committed by Leo Chen on Mar 02, 2021

* init commit, add layer_norm npu kernel

* fix typo

* add unittest

* add unittest

* fix bug

* fix bug

* refine ut

0310945f

Refactor HCCLCommContext to be compatible with Paddle (#31359) · 45765d6e

Committed by Void Main on Mar 02, 2021

45765d6e

[NPU] add npu kernel for elementwise_add_grad (#31347) · 8497e2aa

Committed by Leo Chen on Mar 02, 2021

* fix reading flags from env

* fix problem caused by async run

* support partial grad

* support elementwise_add_grad npu kernel

* add unittest

* fix bug?

8497e2aa

01 Mar, 2021 · 3 commits

add allreduce and broadcast without test (#31024) · 9fcdaeba

Committed by lw921014 on Mar 01, 2021

9fcdaeba

[NPU] Support npu op: (1) slice (2) slice_grad (#31275) · a1ddff81

Committed by liym27 on Mar 01, 2021

a1ddff81

support list of list attribute for NPU (#31299) · d23bf89c

Committed by Leo Chen on Mar 01, 2021

* support list of list attribute for NPU

* fix compile problem

* fix reference

d23bf89c

26 Feb, 2021 · 1 commit

[NPU] Support npu op pow and pow grad (#31247) · 187248f5

Committed by liym27 on Feb 26, 2021

* [NPU] Support npu op: (1) pow (2) pow_grad

* Support fp16

187248f5

25 Feb, 2021 · 2 commits

Fix typo of selected_npus (#31230) · d45f5d78

Committed by Leo Chen on Feb 25, 2021

d45f5d78

refactor npu device manager (#31154) · ff4654e2

Committed by Leo Chen on Feb 25, 2021

ff4654e2

23 Feb, 2021 · 2 commits

[NPU] Support executor with NPU (#31057) · 1435b4c0

Committed by liym27 on Feb 23, 2021

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

1435b4c0

Fix compilation problem (#31100) · 85cbd556

Committed by Leo Chen on Feb 23, 2021

85cbd556

22 Feb, 2021 · 1 commit

add npu kernel for elementwise_sub and elementwise_sub_grad (#30973) · 5cb20f30

Committed by Leo Chen on Feb 22, 2021

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op

Co-authored-by: frankwhzhang <frankwhzhang@126.com>

5cb20f30

09 Feb, 2021 · 3 commits

[feature] support npu allocator, part 2 (#30972) · 1201cd2e

Committed by Leo Chen on Feb 09, 2021

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

1201cd2e

[feature] support npu operator (#30951) · 7e049108

Committed by Leo Chen on Feb 09, 2021

7e049108

[feature] support npu allocator (#30840) · 81138239

Committed by Leo Chen on Feb 09, 2021

81138239

08 Feb, 2021 · 1 commit

Destroy session first. (#30954) · ebef6601

Committed by gongweibao on Feb 08, 2021

ebef6601

28 Jan, 2021 · 1 commit

Dev/fix ascend string (#30749) · 88dfd067

Committed by Leo Chen on Jan 28, 2021

88dfd067

27 Jan, 2021 · 1 commit

fix compilation on ascend-20.1 (#30722) · 6eabbc80

Committed by Leo Chen on Jan 27, 2021

6eabbc80

21 Jan, 2021 · 2 commits

Add Hccl program group (#30642) · e4287ca6

Committed by gongweibao on Jan 21, 2021

e4287ca6

Add distribution supported (#30578) · f9c97dd7

Committed by gongweibao on Jan 21, 2021

f9c97dd7

15 Jan, 2021 · 5 commits

Fix compilcation on CANN20.1 and older (#30494) · 1882f2ce

Committed by gongweibao on Jan 15, 2021

1882f2ce

Ascend rc (#30483) · 6dd52c5b

Committed by hutuxian on Jan 15, 2021

6dd52c5b

export global google flags to users, test=develop (#30448) · 715d8628

Committed by 石晓伟 on Jan 15, 2021

715d8628

fix cache key for inplaced elementwise ops (#30404) · 88fc7a7d

Committed by Wojciech Uss on Jan 15, 2021

88fc7a7d

fix the rnn mask memory bug for out of read (#30459) · 3d49882e

Committed by wawltor on Jan 15, 2021

* fix the rnn mask memory bug for out of read

* update the code for the rnn

3d49882e

14 Jan, 2021 · 5 commits

support transformer v2.0 (#30381) · 6a3c8725

Committed by taixiurong on Jan 14, 2021

6a3c8725

fix flatten api grad (#30426) · e85be1b1

Committed by ShenLiang on Jan 14, 2021

e85be1b1

Heter ps new (#30198) · 6e0da01c

Committed by yaoxuefeng on Jan 14, 2021

6e0da01c

test=develop, add distributed_infer (#30300) · 2a98e932

Committed by 123malin on Jan 14, 2021

2a98e932

fix bug that cann't find mkldnn(kunlun) (#30394) · cf786d22

Committed by QingshuChen on Jan 14, 2021

cf786d22

13 Jan, 2021 · 2 commits

skip quantizing ops in cpu inference (#30342) · 8e3a2940

Committed by cc on Jan 13, 2021

* skip quantizing ops in cpu inference, test=develop

8e3a2940

Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5

Committed by alncat on Jan 13, 2021

* added support for inference using quantization aware trained dygraph

* added support for inference using quantization aware trained dygraph
correct boost get usage

* Delete incorrect warning message (#30196)

* fix warning and no grad

* clean redundant API alias in 2.0 - part 2 (#30013)

* delete paddle.nn.functional.assign

* fix dynamic to static error

* just add the op error message for the matmul xpu (#30246)

 add the op error message for the matmul xpu

* Add Static Variable Clone (#30208)

Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat

* use wget to replace curl to download the lcov file (#30229)

* use wget to replace curl to download the lcov file

* add cache for lcov

* fix test_pool3d_op timeout issue (#30248)

* Fix unittests bugs. (#30250)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* Fix bug for 'save mutiple method' (#30218)

* Fix bug for 'save mutiple method'

* To pass coverage.

* edit code to pass coverage.

* edit code to pass coverage.

* add unittest for coverage.

* change for coverage.

* edit for coverage.

* added support for inference using quantization aware trained dygraph

* Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)

* add alias from  fluid.layers.auc to static.auc

* Update __init__.py

* added support for inference using quantization aware trained dygraph
correct boost get usage

* corrected boost get usage

* corrected naming issues and enforcing zero check

* correct paddle enforce message

* added more error checkings

* corrected error report message and optimized code

* corrected findvar usage

* corrected paddle_enforce in scope

* correct error messages

* correct error reporting format

Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>

7bbf3ac5
