Commits · 28521e0f710916d0f572d688b6c408a83a40e590 · Crayon鑫 / Paddle

15 Jun, 2021 · 1 commit
- W
  Save all the information of 'ParamBase' in 'Layer'. (#33500) · 28521e0f
  Committed by WeiXin on Jun 15, 2021
```
* Save all the information of 'ParamBase' in 'Layer'.

* edit unittest
```
  28521e0f
09 Jun, 2021 · 1 commit
- W
  cache core.globals() to speed up dynamic graph (#32098) · b4954ce4
  Committed by wanghuancoder on Jun 09, 2021
```
* modify API nn.Bilinear's doc, test=develop
```
  b4954ce4
07 Jun, 2021 · 1 commit
- Z
  
  fix too-many-format-args (#33353) · 599e9e48
  Committed by zhangchunle on Jun 07, 2021
  
  599e9e48
27 May, 2021 · 1 commit
- Q
  
  [ROCM] add is_compiled_with_rocm api, test=develop (#33043) · 6a5b7e59
  Committed by Qi Li on May 27, 2021
  
  6a5b7e59
20 May, 2021 · 1 commit
- L
  
  Polish code for setitem and getitem (#32911) · 848cabfc
  Committed by liym27 on May 20, 2021
  
  848cabfc
18 May, 2021 · 1 commit

[Dy2Static] Refactor param_guard logic of @to_static (#32867) · b8d493df

Committed by Aurelius84 on May 18, 2021

* Add param_guard in ParameterList to support @to_static

* Refactor param_guard of @to_static

* fix unittest failed

* add more unittest

b8d493df

14 May, 2021 · 1 commit
- L
  
  Polish code for _getitem_impl_ (#32868) · 096b2f5a
  Committed by liym27 on May 14, 2021
  
  096b2f5a
12 May, 2021 · 1 commit

add varbasecopy func to fix the ParamBase type bug in layers.to API (#32789) · 067f558c

Committed by chentianyu03 on May 12, 2021

* add varbasecopy func to fix the paraBase type bug in layers.to API

* overload _copy_to func for ParamBase

* add xpuplace

* add waiting varbsecopy completion when not blocking

* fix dst_device bug

* modify varbase to shared_ptr

067f558c

06 May, 2021 · 1 commit
- G
  
  Fix bugs of pipeline on ascend. (#32737) · c5ae21f4
  Committed by gongweibao on May 06, 2021
  
  c5ae21f4
28 Apr, 2021 · 1 commit
- C
  Add fake interface for register_hook in static mode (#32642) · 9aad7527
  Committed by Chen Weihang on Apr 28, 2021
```
* add fake interface for hook in static mode

* add unittests

* fix failed unittests
```
  9aad7527
27 Apr, 2021 · 1 commit

[Docs] Modified the docs of some api for supporting list/tuple args. (#32360) · 15158927

Committed by xiemoyuan on Apr 27, 2021

* fixed docs.

* Fixed docs. test=document_fix

code bak.

fixed docs. test=document_fix

* Revert to previous version of python/paddle/fluid/backward.py

* fixed bugs.

* test=document_fix. Fixed examples.

15158927

26 Apr, 2021 · 1 commit
- Y
  Unset ReserveSpace of batch_norm for inference program. (#32493) · 202b0eaf
  Committed by Yiqun Liu on Apr 26, 2021
```
* Unset ReserveSpace for inference program.

* Support training from an inference program.
```
  202b0eaf
25 Apr, 2021 · 2 commits
- B
  
  add copy_cross_scope (#32432) · 5943ff7b
  Committed by Baibaifan on Apr 25, 2021
  
  5943ff7b
- Z
  
  add detail for gpu_id, document_fix (#32444) · 136ef09d
  Committed by Zhang Ting on Apr 25, 2021
  
  136ef09d
21 Apr, 2021 · 1 commit

【NPU】Merge NPU ccl code (#32381) · c3158527

Committed by zhang wenhui on Apr 21, 2021

* add allreduce and broadcast without test (#31024)

add allreduce and broadcast without test

* Refactor HCCLCommContext to be compatible with Paddle (#31359)

Refactor HCCLCommContext to be compatible with Paddle (#31359)

* [NPU] add npu kernel for communication op (#31437)

* add allreduce and broadcast without test

* add c_broadcast_test case

* build c_comm_init and c_create_group operators

* make the whole thing compile

* add broadcast and init op test case but run failed

* make unit test compile

* fix broadcast test bug and change into hcom for ccl

* change c_comm_init and c_create_group ops accordingly

* make tests compile

* transfer code to 27

* compiled successfully in 28, but run failed

* test broadcast in 28, but failed

* make hcom primitives work

* change hccl data type for base.h

* fix broadcast bug

* make attributes work

* fix group name bug

* add allreduce but test failed

* allreduce bug for qiuliang

* allreduce finished

* add allgather and reducescatter

* merge all op code

* add allgather test

* finish run all ccl op test exclude send/recv

* all all op and test exclude send/recv

* send_v2_npu.cc recv_v2_npiu.cc compiled

* fix ccl core dump bug and test allgather, reducescatter, broadcast op

* fix allreduce bug just for test

* hcom send&recv test pass, without hcom_destroy

* for qiuliang test

* Ascend Send&Recv Test Pass

* all op (ex send/recv) ok

* fix bug

* merge all ccl op

* style merge to PaddlePaddle

* merge style

* new merge style

* merge style 2

* insert an empty at the end

* disable ctest for hcom to pass ci
Co-authored-by: void-main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>

* Add auto-increasing tag id for Hcom OPs (#31702)

* add c_reduce_sum op (#31793)

add c_reduce_sum op

* update Ascendrc hccl to 20.3 (#32126)

update Ascendrc hccl to 20.3 (#32126)

* fix merge code

* change cmake.txt1

* [NPU] Support npu kernel for c sync stream op (#31386)

* sync stream npu op

* add with_ascend_acl

* update c++ unittest

* compile all failed

* try to pre commit

* after pre commit

* merge&compile&test hccl successfully!

* fix code style

* fix code style

* fix bugs about hccl

* fix some bugs

* fix code style

* fix style

* fix style

* fix

* fixed

* merge develop
Co-authored-by: lw921014 <liuwei921014@yeah.net>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>
Co-authored-by: xiayanming <41795079@qq.com>

c3158527

15 Apr, 2021 · 1 commit
- F
  fix test sync_with_cpp (#32212) · 0c037d2d
  Committed by fangshuixun007 on Apr 15, 2021
```
fix test sync_with_cpp (#32212)
```
  0c037d2d
09 Apr, 2021 · 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, bug failed to exit

* call aclrtResetDevice before exit

* fix aclFinilize

* add system allocatot test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code stype

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

02 Apr, 2021 · 1 commit

support save/load single tensor (#31756) · 43367e4b

Committed by WeiXin on Apr 02, 2021

* support save/load single tensor

* compatibility modification according to unnittest

* Some python2.7 don't have 'copyreg' modules

* Handle a syntax error.

* Dealing with compatibility problems on Mac.

* Dealing with compatibility problems on Mac.

* edit unittest to improve coverage.

* Modify the code according to the review comments

* Reduce redundant code.

* support for static graph loading dygraph state_dict

* edit code according to CI

* edit unittest

* edit unnittest

* delete redundant file

* edit code according to Comments

* edit english doc

* edit english doc

* edit English DOC.

* get/set_tensor->get/set_value; return_numpy=False

* get/set_tensor->get/set_value; return_numpy=False

* edit unnittest

* edit unnittest

* polish code.

43367e4b

30 Mar, 2021 · 2 commits
- Z
  [Custom OP]Remove old custom OP and reduce whl package volume (#31813) · 04a49b09
  Committed by Zhou Wei on Mar 30, 2021
```
* Remove old custom OP to reduce whl package volume

* [Custom OP]Remove old custom OP to reduce whl package volume
```
  04a49b09
- A
  Fix segment Fault from set_value (#31891) · c4b60efa
  Committed by Aurelius84 on Mar 30, 2021
```
* Avoid raising warning while import paddle

* fix segment fault of set_value

* fix code style
```
  c4b60efa
29 Mar, 2021 · 1 commit
- L
  
  Fix bug of set_value op：Decerease axes to do right broadcast (#31875) · 525c32e3
  Committed by liym27 on Mar 29, 2021
  
  525c32e3
23 Mar, 2021 · 1 commit
- F
  
  add coalesce_tensor into white list when checking re-creation of parameters (#31800) · 4046f130
  Committed by Feiyu Chan on Mar 23, 2021
  
  4046f130
17 Mar, 2021 · 1 commit
- L
  
  In __getitem__, convert integers to int64 Tensor not int32 to be compatible with Lite(#31658) · 402288ad
  Committed by liym27 on Mar 17, 2021
  
  402288ad
10 Mar, 2021 · 1 commit
- W
  
  Add collective async wait op (#31463) · 83a2fb1f
  Committed by WangXi on Mar 10, 2021
  
  83a2fb1f
20 Feb, 2021 · 1 commit

[static setitem] Support the index is Tensor; step>1; step<0 .(#30949) · 5b367dab

Committed by liym27 on Feb 20, 2021

* [static setitem] support the index step > 1. tensor_a[::3] = value

* [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value

* [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value

* Add op version.

5b367dab
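
To make the three slice forms named in this commit concrete, the sketch below shows which positions each one selects. It uses plain Python slice arithmetic rather than Paddle tensors, on the assumption that static-mode setitem follows standard Python slicing semantics; `touched_positions` is a hypothetical helper for illustration and is not part of Paddle. The last case assumes the Tensor used as the start index holds the value 3.

```python
def touched_positions(length, index):
    """Return the positions a slice selects in a sequence of the given length."""
    # slice.indices() normalizes missing bounds and negative steps.
    start, stop, step = index.indices(length)
    return list(range(start, stop, step))


# tensor_a[::3] = value: every third position, front to back.
print(touched_positions(10, slice(None, None, 3)))   # [0, 3, 6, 9]

# tensor_a[::-3] = value: every third position, back to front.
print(touched_positions(10, slice(None, None, -3)))  # [9, 6, 3, 0]

# tensor_a[tensor_3:0:-1] = value, assuming tensor_3 holds 3.
print(touched_positions(10, slice(3, 0, -1)))        # [3, 2, 1]
```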

18 Feb, 2021 · 1 commit
- H
  Refine fake_interface Error Message (#30981) · cbbe1274
  Committed by Huihuang Zheng on Feb 18, 2021
```
Refine fake_interface Error Message
```
  cbbe1274
08 Feb, 2021 · 1 commit
- L
  
  [Static setitem] Support index is ellipsis for setitem in static mode (#30836) · 12c15beb
  Committed by liym27 on Feb 08, 2021
  
  12c15beb
05 Feb, 2021 · 1 commit
- L
  
  [Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858) · 4a8b8b45
  Committed by liuyuhui on Feb 05, 2021
  
  4a8b8b45
03 Feb, 2021 · 1 commit

[CustomOp] Support install as Package and Add load interface (#30798) · e49d0746

Committed by Aurelius84 on Feb 03, 2021

* support setup.py to compile custom op

* move file into paddle.utils.cpp_extension

* support python setup.py install

* refine code style

* Enrich code and add unittest

* Polish code and api doc

* fix cpp_extension not include in package

* fix relative import

* fix os.makedirs exist_ok param compatibility PY2

* add compile flags in test_jit_load

e49d0746

19 Jan, 2021 · 1 commit
- Z
  [2.0 API] device guard (#30307) · 66c514ce
  Committed by Zhang Ting on Jan 19, 2021
```
* add 2.0 API: device_guard
```
  66c514ce
13 Jan, 2021 · 1 commit

Set expected place in child thread for dataloader to avoid costing cuda memory... · 3d015f1c

Committed by Leo Chen on Jan 13, 2021

Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

3d015f1c

11 Jan, 2021 · 2 commits

L
Support vector<double> as type of op attribute and op set_value suppport... · b4989fb7
Committed by liym27 on Jan 11, 2021
```
Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126)
```
b4989fb7

Add Static Variable Clone (#30208) · c372a763

Committed by Huihuang Zheng on Jan 11, 2021

Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat

c372a763

07 Jan, 2021 · 1 commit
- W
  
  refine the paddle place support using str (#28769) · 7dd551e0
  Committed by wangchaochaohu on Jan 07, 2021
  
  7dd551e0
31 Dec, 2020 · 1 commit
- L
  
  fix error message (#30020) · a253a78a
  Committed by Leo Chen on Dec 31, 2020
  
  a253a78a
26 Dec, 2020 · 1 commit
- L
  
  [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574) · 4427df37
  Committed by liuyuhui on Dec 26, 2020
  
  4427df37
24 Dec, 2020 · 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Committed by tangwei12 on Dec 24, 2020

* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

032414ca

23 Dec, 2020 · 1 commit

[setitem] Support Tensor setitem in static mode (#29708) · 97e75ad0

Committed by liym27 on Dec 23, 2020

1. Type of index: int, slice(step must be 1).

2. Type of value: 
 (1) int32, int64, float32, bool; 
 (2) numpy.array(int32, int64, float32, bool);<Note: float64 is not supported>
 (3) paddle.Tensor(int32, int64, float32, float64, bool);

97e75ad0
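
The index rule stated in this commit message (an int, or a slice whose step is 1) can be sketched as a small check. This is only an illustration of the rule as written, not Paddle's own validation code; `is_supported_index` is a hypothetical name, and treating an omitted step as 1 is an assumption based on standard Python slicing.

```python
def is_supported_index(index):
    """Check an index against the rule stated in commit #29708."""
    if isinstance(index, int):
        return True
    if isinstance(index, slice):
        # An omitted step (None) is assumed to mean the default step of 1.
        return index.step is None or index.step == 1
    return False


print(is_supported_index(2))               # True
print(is_supported_index(slice(0, 3)))     # True
print(is_supported_index(slice(0, 3, 2)))  # False
```

The later commit #30949 listed above relaxes this rule by adding support for steps greater than 1, negative steps, and Tensor indices.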

16 Dec, 2020 · 1 commit
- L
  
  [Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337) · f13c3a9c
  Committed by liuyuhui on Dec 16, 2020
  
  f13c3a9c
09 Dec, 2020 · 1 commit
- Z
  support deepcopy for Layer/Tensor/Paramerbase (#29387) · e74e1a22
  Committed by Zhou Wei on Dec 09, 2020
```
* support deepcopy for Layer/Tensor/Paramerbase

* fix some code
```
  e74e1a22

Crayon鑫 / Paddle is in sync with the fork source project