Commits · d19a9b3954f7e29356410824213806b7e27d37e4 · BaiXuePrincess / Paddle

18 Oct, 2021 1 commit
- [XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph... · d19a9b39
  Committed by taixiurong on Oct 18, 2021
```
[XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph 3. xpu support update weight params in amp (#36439)
```
  d19a9b39
13 Oct, 2021 1 commit
- [AMP] add attr is_distributed for layer.to (#36221) · 9a9953d9
  Committed by zhangbo9674 on Oct 13, 2021
```
* add attr is_distributed

* refine code

* refine black/white list for pure fp16
```
  9a9953d9
11 Oct, 2021 1 commit

[NPU] fix matmul_v2 and utils.run_check, test=develop (#36164) · 7850f7ce

Committed by Qi Li on Oct 11, 2021

* [NPU] fix matmul_v2 and utils.run_check, test=develop

* remove debug files, test=develop

* fix install_check, test=develop

* fix doc, test=develop

* fix review comments, test=develop

7850f7ce

08 Oct, 2021 1 commit
- add python interface of sub_graph (#36120) · a29ff4c7
  Committed by huangxu96 on Oct 08, 2021
```
Add python interface of subgraph: 1. all_sub_graphs() 2. get_sub_graph(idx)
```
  a29ff4c7
27 Sep, 2021 1 commit

support saving model defined parameters without adding scale_op (#36119) · 8db6d221

Committed by Haipeng Wang on Sep 27, 2021

* add scale_op in model save step is not necessary, just fix the prune method to support static graph and inplace op

* fix jit.save, no need to add scale_op to each outputvar anymore.
fix prune_with_input, now it supports inplace op

* temporarily disable test_trt_dynamic_shape.TRTDynamicShapeOutOfBound2Test

* allow user to export parameters defined in model

8db6d221

17 Sep, 2021 1 commit

add inplace op support to prune, scale_op is no longer needed in jit.save (#35730) · 21921936

Committed by Haipeng Wang on Sep 17, 2021

* add scale_op in model save step is not necessary, just fix the prune method to support static graph and inplace op

* fix jit.save, no need to add scale_op to each outputvar anymore.
fix prune_with_input, now it supports inplace op

* temporarily disable test_trt_dynamic_shape.TRTDynamicShapeOutOfBound2Test

21921936

15 Sep, 2021 1 commit

clip op extra information when export model. (#35447) · 4d236354

Committed by 王明冬 on Sep 15, 2021

* clip op extra information when export model,test=ocr

* rename clip_extra parameter to kwargs in save_inference_model, test=ocr

4d236354

14 Sep, 2021 2 commits

Integrate StandaloneExecutor in Static.Executor Interface with... · 4bc08530

Committed by Aurelius84 on Sep 14, 2021

Integrate StandaloneExecutor in Static.Executor Interface with FLAGS_USE_STANDALONE_EXECUTOR (#35628)

* Integrate StandaloneExecutor in Static.Executor Interface with FLAGS_USE_STANDALONE_EXECUTOR

* Enhance unittest and clean code in StandaloneExecutor

* polish unittest

4bc08530

add paddle.Tensor api fill_(inplace), zero_(inplace) (#33829) · efeec79b
Committed by zhiboniu on Sep 14, 2021
```
add fill_ backward
```
efeec79b

10 Sep, 2021 1 commit

Fix warning (#34875) · 966f042d

Committed by sunzhongkai588 on Sep 10, 2021

* fix warning error , test=document_fix

* fix warning error , test=document_fix

* fix warning error , test=document_fix

* fix warning error , test=document_fix

* fix warning error , test=document_fix

* fix warning error , test=document_fix

* fix warning error , test=document_fix

966f042d

08 Sep, 2021 1 commit
- add API Tensor.T for reverse dim of Tensor (#35379) · 2133f3dd
  Committed by zhouweiwei2014 on Sep 08, 2021
  
  2133f3dd
24 Aug, 2021 1 commit

Add auto completion module for auto parallel (#34813) · 93d862b0

Committed by Yulong Ao on Aug 24, 2021

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* add dist

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update

* update

* delete unused proto

* restore op_desc

* restore type_defs

* update var_desc

* remove dims_mapping for proto_pybind

* update interface.py

* update framework.py

* update

* update

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* [WIP] Add the auto completion feature and related codes

* [WIP] Improve the auto completion and related codes

* [WIP] Make the auto completion to support data-parallel

* [WIP] Make the completion support mp and dp+mp

* [WIP] Refactor auto completion unit test for MLP

* [WIP] Refactor the implementation of DistributedOperatorImpl

* [WIP] Improve dims_mapping update rule and fix a bug

* [WIP] Support auto completion for one transformer decoder layer

* [WIP] Add a minor change

* [WIP] Fix a bug within the unit test

* Shard XShape tensor, add embedding completion and refactor code

* Add the distributed_operators dir to setup.py.in

* Improve the completion process and add the unittest for gpt

* fix process_mesh ut

* fix process_mesh ut

* update

* update, test=develop

* Add support for automatically completing distributed attrs of special ops

* update

* update

* update

* fix doc sample codes, test=develop

* improve coverage, test=develop

* add static_mode check, test=develop

* Model the cluster for cost model and physical mapping

* update, test=develop

* add set_placement, test=develop

* Add the check to make sure the candidate tensors' size is greater than zero

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update, test=develop

* Auto mark dist attrs annotated by user

* update ndarray to nested list, test=develop

* update, test=develop

* Add auto-completion module for auto-parallel (based on PR#33804)

* Remove unnecessary files

* Remove unrelated files for the auto completion pr

* Update the unit test to improve the coverage

* Modify codes based on reviews

* Minor changes for CI

* Improve some codes based on new comments

* Fix bugs caused by shallow copy in attributes.py
* Improve amend_distributed_attr_for_program in context.py
* Other changes for weihang's comments
Co-authored-by: sandyhouse <lilong12@baidu.com>

93d862b0

18 Aug, 2021 1 commit

Add function to disable paddle signal handler (#34577) · dd533dd3

Committed by Zhanlue Yang on Aug 18, 2021

* Add function to disable paddle signal handler

Paddle uses google::InstallFailureSignalHandler to handle selected system signals,
mainly for debugging and bug reporting.

However, this can conflict with other Python packages that capture the same signals,
such as tvm.

To resolve this, we add a function that disables the signal handler.

* Remove signal test from WIN32 platform

* Remove redundant return from disable_signal_handler() function

* Add detailed messages to en_doc

dd533dd3
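
The commit above can be exercised with a small guard like the following. This is only a sketch: it assumes the new function is exposed as `paddle.disable_signal_handler()` (the commit message names the function but not its namespace), and the helper name `disable_paddle_signal_handler` is hypothetical. It falls back quietly when Paddle is not installed or the build predates this change.

```python
import importlib


def disable_paddle_signal_handler():
    """Try to switch off Paddle's fault signal handler.

    Returns True if the call was made, False if Paddle (or the function)
    is not available in this environment.
    """
    try:
        # Assumption: the function lives at the top level of `paddle`.
        paddle = importlib.import_module("paddle")
    except Exception:
        # Paddle is not installed or failed to load; nothing to disable.
        return False

    disable = getattr(paddle, "disable_signal_handler", None)
    if not callable(disable):
        # Older build that does not ship this function.
        return False

    disable()
    return True


if __name__ == "__main__":
    # Intended to run before importing packages (such as tvm) that
    # install their own handlers for the same signals.
    print("handler disabled:", disable_paddle_signal_handler())
```

Calling the helper once at program start, before the conflicting package is imported, is the usage the commit message suggests; whether ordering matters in practice depends on the other package.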

17 Aug, 2021 1 commit

Add some passes which can be applied to Program (#34730) · 8046e33d

Committed by Zeng Jinle on Aug 17, 2021

* add inplace passes and tests

* update

* fix use_cuda undefined
fix compile error of op compat

* add more ut

* fix CPU CI error

* check adam unique

* fix mac/windows ci, improve coverage

* fix ci error

* follow weihang's comment

* fix BlockDesc::MoveFrom

* follow qiuliang's comment

* update

* follow huihuang's comments

8046e33d

11 Aug, 2021 1 commit
- add the basic apis for auto_parallel (#33804) · 3f962e77
  Committed by lilong12 on Aug 11, 2021
```
* add auto_parallel apis
```
  3f962e77
05 Aug, 2021 1 commit

[Dy2Stat]Support Mixed Precision training in @to_static (#34562) · a842828a

Committed by Aurelius84 on Aug 05, 2021

* Support Mixed Precision training in @to_static

* fix block.vars logic

* fix GPU training loss diff

* remove unused code

a842828a

02 Aug, 2021 1 commit

Add basic functions of Program Pass (#34524) · 145cdb5a

Committed by Zeng Jinle on Aug 02, 2021

* add basic APIs

* add attr_types

* follow comments

* change pass attr types

* add set pass attribute codes

* refine PADDLE_THROW

145cdb5a

28 Jul, 2021 1 commit

graph_to_program save parameter and stop_gradient information (#33771) · 8a7dee31

Committed by jiangcheng on Jul 28, 2021

This PR adds optional boolean fields is_parameter and stop_gradient to the VarDesc proto and removes them during save_inference_model.

8a7dee31

08 Jul, 2021 1 commit
- delete the function of saving layer object. (#33697) · e22701c4
  Committed by WeiXin on Jul 08, 2021
```
* delete the function of saving layer object.

* edit doc of paddle.save/load and polish error message
```
  e22701c4
06 Jul, 2021 1 commit
- public api:add bn\ln\in; add static.xpu_place (#33897) · 6b95e674
  Committed by zhiboniu on Jul 06, 2021
  
  6b95e674
24 Jun, 2021 1 commit
- supplement several interfaces of static Variable to be consistent with dygraph Tensor (#33330) · af9dcb2d
  Committed by CtfGo on Jun 24, 2021
```
As the title
```
  af9dcb2d
18 Jun, 2021 1 commit
- update py head file (#33653) · 1e4e6a36
  Committed by huzhiqiang on Jun 18, 2021
  
  1e4e6a36
15 Jun, 2021 1 commit
- Save all the information of 'ParamBase' in 'Layer'. (#33500) · 28521e0f
  Committed by WeiXin on Jun 15, 2021
```
* Save all the information of 'ParamBase' in 'Layer'.

* edit unittest
```
  28521e0f
09 Jun, 2021 1 commit
- cache core.globals() to speed up dynamic graph (#32098) · b4954ce4
  Committed by wanghuancoder on Jun 09, 2021
```
* modify API nn.Bilinear's doc, test=develop
```
  b4954ce4
07 Jun, 2021 1 commit
- fix too-many-format-args (#33353) · 599e9e48
  Committed by zhangchunle on Jun 07, 2021
  
  599e9e48
27 May, 2021 1 commit
- [ROCM] add is_compiled_with_rocm api, test=develop (#33043) · 6a5b7e59
  Committed by Qi Li on May 27, 2021
  
  6a5b7e59
20 May, 2021 1 commit
- Polish code for setitem and getitem (#32911) · 848cabfc
  Committed by liym27 on May 20, 2021
  
  848cabfc
18 May, 2021 1 commit

[Dy2Static] Refactor param_guard logic of @to_static (#32867) · b8d493df

Committed by Aurelius84 on May 18, 2021

* Add param_guard in ParameterList to support @to_static

* Refactor param_guard of @to_static

* fix unittest failed

* add more unittest

b8d493df

14 May, 2021 1 commit
- Polish code for _getitem_impl_ (#32868) · 096b2f5a
  Committed by liym27 on May 14, 2021
  
  096b2f5a
12 May, 2021 1 commit

add varbasecopy func to fix the ParamBase type bug in layers.to API (#32789) · 067f558c

Committed by chentianyu03 on May 12, 2021

* add varbasecopy func to fix the ParamBase type bug in layers.to API

* overload _copy_to func for ParamBase

* add xpuplace

* add waiting varbasecopy completion when not blocking

* fix dst_device bug

* modify varbase to shared_ptr

067f558c

06 May, 2021 1 commit
- Fix bugs of pipeline on ascend. (#32737) · c5ae21f4
  Committed by gongweibao on May 06, 2021
  
  c5ae21f4
28 Apr, 2021 1 commit
- Add fake interface for register_hook in static mode (#32642) · 9aad7527
  Committed by Chen Weihang on Apr 28, 2021
```
* add fake interface for hook in static mode

* add unittests

* fix failed unittests
```
  9aad7527
27 Apr, 2021 1 commit

[Docs] Modified the docs of some api for supporting list/tuple args. (#32360) · 15158927

Committed by xiemoyuan on Apr 27, 2021

* fixed docs.

* Fixed docs. test=document_fix

code bak.

fixed docs. test=document_fix

* Revert to previous version of python/paddle/fluid/backward.py

* fixed bugs.

* test=document_fix. Fixed examples.

15158927

26 Apr, 2021 1 commit
- Unset ReserveSpace of batch_norm for inference program. (#32493) · 202b0eaf
  Committed by Yiqun Liu on Apr 26, 2021
```
* Unset ReserveSpace for inference program.

* Support training from an inference program.
```
  202b0eaf
25 Apr, 2021 2 commits
- add copy_cross_scope (#32432) · 5943ff7b
  Committed by Baibaifan on Apr 25, 2021
  
  5943ff7b
- add detail for gpu_id, document_fix (#32444) · 136ef09d
  Committed by Zhang Ting on Apr 25, 2021
  
  136ef09d
21 Apr, 2021 1 commit

【NPU】Merge NPU ccl code (#32381) · c3158527

Committed by zhang wenhui on Apr 21, 2021

* add allreduce and broadcast without test (#31024)

add allreduce and broadcast without test

* Refactor HCCLCommContext to be compatible with Paddle (#31359)

Refactor HCCLCommContext to be compatible with Paddle (#31359)

* [NPU] add npu kernel for communication op (#31437)

* add allreduce and broadcast without test

* add c_broadcast_test case

* build c_comm_init and c_create_group operators

* make the whole thing compile

* add broadcast and init op test case but run failed

* make unit test compile

* fix broadcast test bug and change into hcom for ccl

* change c_comm_init and c_create_group ops accordingly

* make tests compile

* transfer code to 27

* compiled successfully in 28, but run failed

* test broadcast in 28, but failed

* make hcom primitives work

* change hccl data type for base.h

* fix broadcast bug

* make attributes work

* fix group name bug

* add allreduce but test failed

* allreduce bug for qiuliang

* allreduce finished

* add allgather and reducescatter

* merge all op code

* add allgather test

* finish run all ccl op test exclude send/recv

* all all op and test exclude send/recv

* send_v2_npu.cc recv_v2_npu.cc compiled

* fix ccl core dump bug and test allgather, reducescatter, broadcast op

* fix allreduce bug just for test

* hcom send&recv test pass, without hcom_destroy

* for qiuliang test

* Ascend Send&Recv Test Pass

* all op (ex send/recv) ok

* fix bug

* merge all ccl op

* style merge to PaddlePaddle

* merge style

* new merge style

* merge style 2

* insert an empty at the end

* disable ctest for hcom to pass ci
Co-authored-by: void-main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>

* Add auto-increasing tag id for Hcom OPs (#31702)

* add c_reduce_sum op (#31793)

add c_reduce_sum op

* update Ascendrc hccl to 20.3 (#32126)

update Ascendrc hccl to 20.3 (#32126)

* fix merge code

* change cmake.txt1

* [NPU] Support npu kernel for c sync stream op (#31386)

* sync stream npu op

* add with_ascend_acl

* update c++ unittest

* compile all failed

* try to pre commit

* after pre commit

* merge&compile&test hccl successfully!

* fix code style

* fix code style

* fix bugs about hccl

* fix some bugs

* fix code style

* fix style

* fix style

* fix

* fixed

* merge develop
Co-authored-by: lw921014 <liuwei921014@yeah.net>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>
Co-authored-by: xiayanming <41795079@qq.com>

c3158527

15 Apr, 2021 1 commit
- fix test sync_with_cpp (#32212) · 0c037d2d
  Committed by fangshuixun007 on Apr 15, 2021
```
fix test sync_with_cpp (#32212)
```
  0c037d2d
09 Apr, 2021 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

02 Apr, 2021 1 commit

support save/load single tensor (#31756) · 43367e4b

Committed by WeiXin on Apr 02, 2021

* support save/load single tensor

* compatibility modification according to unittest

* Some python2.7 don't have 'copyreg' modules

* Handle a syntax error.

* Dealing with compatibility problems on Mac.

* Dealing with compatibility problems on Mac.

* edit unittest to improve coverage.

* Modify the code according to the review comments

* Reduce redundant code.

* support for static graph loading dygraph state_dict

* edit code according to CI

* edit unittest

* edit unittest

* delete redundant file

* edit code according to Comments

* edit english doc

* edit english doc

* edit English DOC.

* get/set_tensor->get/set_value; return_numpy=False

* get/set_tensor->get/set_value; return_numpy=False

* edit unittest

* edit unittest

* polish code.

43367e4b
