Commits · 2fa3ce2b4a9b7f2bbd53ae58d0d0b9b27eb00ca2 · 机器未来 / Paddle

26 Aug, 2021 2 commits

Revert "Add copy from tensor (#34406)" · 2fa3ce2b
Committed by zhangchunle on Aug 26, 2021
```
This reverts commit ac33c0ca.
```
2fa3ce2b

Add copy from tensor (#34406) · ac33c0ca

Committed by Shang Zhizhou on Aug 26, 2021

* add api

* temp save

* revert

* copytocpu async ok

* fix style

* copy sync ok

* fix compile error

* fix compile error

* api done

* update python async api

* fix compile

* remove async python api; add c++ async unittest

* remove python async api

* update unittest

* update unittest

* add C++ unittest for copytensor

* add unittest

* update namespace utils to class TensorUtils

* add unittest

* update unittest

* update unittest

* update code style

* update code style

* update unittest

ac33c0ca

25 Aug, 2021 1 commit

fix potential tensor leak in tensor.__setitem__ (#35013) · 763b6d91

Committed by Leo Chen on Aug 25, 2021

* fix index tensor leak in __setitem__

* fix another usage of PyTuple_Pack

* refine code

* refine code

* handle None index

* add Py_DecRef

* revert ut

* refine code

* merge develop

* use RAII

* follow comments

763b6d91

24 Aug, 2021 2 commits

add fetch, test=develop (#35019) · a5060b55

Committed by wanghuancoder on Aug 24, 2021

* add fetch, test=develop

* fix fetch2op, test=develop

* fix fetch2op, test=develop

* refine, test=develop

* fix fetch ctx, test=develop

* add wait, test=develop

* rename fetch2 to fetch_v2, test=develop

* merge, test=develop

a5060b55

Add auto completion module for auto parallel (#34813) · 93d862b0

Committed by Yulong Ao on Aug 24, 2021

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* add dist

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update

* update

* delete unused proto

* restore op_desc

* restore type_defs

* update var_desc

* remove dims_mapping for proto_pybind

* update interface.py

* update framework.py

* update

* update

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* [WIP] Add the auto completion feature and related codes

* [WIP] Improve the auto completion and related codes

* [WIP] Make the auto completion to support data-parallel

* [WIP] Make the completion support mp and dp+mp

* [WIP] Refactor auto completion unit test for MLP

* [WIP] Refactor the implementation of DistributedOperatorImpl

* [WIP] Improve dims_mapping update rule and fix a bug

* [WIP] Support auto completion for one transformer decoder layer

* [WIP] Add a minor change

* [WIP] Fix a bug within the unit test

* Shard XShape tensor, add embedding completion and refactor code

* Add the distributed_operators dir to setup.py.in

* Improve the completion process and add the unittest for gpt

* fix process_mesh ut

* fix process_mesh ut

* update

* update, test=develop

* Add support for automatically completing distributed attrs of special ops

* update

* update

* update

* fix doc sample codes, test=develop

* improve coverage, test=develop

* add static_mode check, test=develop

* Model the cluster for cost model and physical mapping

* update, test=develop

* add set_placement, test=develop

* Add the check to make sure the candidate tensors' size is greater than zero

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update, test=develop

* Auto mark dist attrs annotated by user

* update ndarray to nested list, test=develop

* update, test=develop

* Add auto-completion module for auto-parallel (based on PR#33804)

* Remove unnecessary files

* Remove unrelated files for the auto completion pr

* Update the unit test to improve the coverage

* Modify codes based on reviews

* Minor changes for CI

* Improve some codes based on new comments

* Fix bugs caused by shallow copy in attributes.py

* Improve amend_distributed_attr_for_program in context.py

* Other changes for weihang's comments

Co-authored-by: sandyhouse <lilong12@baidu.com>

93d862b0

23 Aug, 2021 4 commits
- [CPU] Enable barrier op upon gloo (#34671) · e8f146a9
  Committed by Bo Liu on Aug 23, 2021
  
  e8f146a9
- Support getitem by Bool index (#35026) · b6dc16cb
  Committed by zyfncg on Aug 23, 2021
```
* Support getitem by Bool index

* delete some debug info of bool index

* support the case where the shape of the bool index differs from that of the indexed tensor
```
  b6dc16cb
- set node feature (#34994) · c3efabeb
  Committed by seemingwang on Aug 23, 2021
  
  c3efabeb
- add adamw cuda kernel (#35020) · 77a8a394
  Committed by zhaoyingli on Aug 23, 2021
```
* adamw support cuda

* adamw support cuda
```
  77a8a394
19 Aug, 2021 1 commit

Abstract DeviceEvent to manage cross-platform Event implementation (#34922) · 22da1907

Committed by Aurelius84 on Aug 19, 2021

* add device_context

* add gtest for device_event_gpu

* Remove duplicate DeviceType

* push for test

* add unittest

* fix macros

* fix MSVC using usage

22da1907

18 Aug, 2021 2 commits

code refactoring for new executor (#34970) · 40d4d834

Committed by wanghuancoder on Aug 18, 2021

* code refactoring, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

40d4d834

Add function to disable paddle signal handler (#34577) · dd533dd3

Committed by Zhanlue Yang on Aug 18, 2021

* Add function to disable paddle signal handler

Paddle uses google::InstallFaultSignalHandler to handle selected system signals,
mainly for debugging and bug-reporting purposes.

However, this can conflict with other Python packages that capture similar signals,
such as tvm.

To resolve this issue, we provide a function to disable the signal handler.

* Remove signal test from WIN32 platform

* Remove redundant return from disable_signal_handler() function

* Add detailed messages to en_doc

dd533dd3
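
The conflict described in the commit message above can be illustrated with a small sketch. It uses only Python's standard `signal` module, not Paddle itself: `framework_handler` and `other_package_handler` are hypothetical stand-ins, and the real handler is installed in C++ via glog, so this is an analogy under those assumptions rather than Paddle's implementation. It shows that a signal has a single handler slot, so a later registration silently displaces an earlier one, and that "disabling" amounts to handing the slot back.

```python
import signal


def framework_handler(signum, frame):
    # Hypothetical stand-in for a framework's fault-signal handler.
    print("framework handler saw signal", signum)


def other_package_handler(signum, frame):
    # Hypothetical stand-in for a handler installed by another package.
    print("other package handler saw signal", signum)


def simulate_handler_conflict(signum=signal.SIGTERM):
    """Return (conflict, restored) for a simulated double registration."""
    original = signal.getsignal(signum)
    try:
        # The other package registers its handler first.
        signal.signal(signum, other_package_handler)
        # The framework registers later; signal.signal returns the handler it displaced.
        displaced = signal.signal(signum, framework_handler)
        conflict = (
            displaced is other_package_handler
            and signal.getsignal(signum) is framework_handler
        )
        # "Disabling" the framework handler: give the slot back to the displaced handler.
        signal.signal(signum, displaced)
        restored = signal.getsignal(signum) is other_package_handler
        return conflict, restored
    finally:
        # Leave the process as we found it (None means a non-Python handler we cannot reinstall).
        if original is not None:
            signal.signal(signum, original)


if __name__ == "__main__":
    print(simulate_handler_conflict())
```

Note that `signal.signal` may only be called from the main thread, so the sketch is meant to be run as a script.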

17 Aug, 2021 2 commits

Copy boost optional to Paddle (#34780) · 9be41447

Committed by chentianyu03 on Aug 17, 2021

* copy boost optional.hpp to paddle

* copy boost optional.hpp to paddle

* move directories

* del fluid/utils

* modify .hpp to .h

* move directories

* modify to paddle::optional

* add modification description

* format code style for the files in paddle/utils

* format code style

9be41447

Add some passes which can be applied to Program (#34730) · 8046e33d

Committed by Zeng Jinle on Aug 17, 2021

* add inplace passes and tests

* update

* fix use_cuda undefined
fix compile error of op compat

* add more ut

* fix CPU CI error

* check adam unique

* fix mac/windows ci, improve coverage

* fix ci error

* follow weihang's comment

* fix BlockDesc::MoveFrom

* follow qiuliang's comment

* update

* follow huihuang's comments

8046e33d

16 Aug, 2021 2 commits

Change the invoking method of setitem by Ellipsis and None index from numpy to... · 2e30134f

Committed by zyfncg on Aug 16, 2021

Change the invoking method of setitem by Ellipsis and None index from numpy to set_value op (#34911)

* Change the invoking method of setitem by Ellipsis and None index from numpy to set_value op

* add none_axes into attr of set_value_op in dygraph mode

2e30134f

add unique_consecutive_op (#34334) · 875cfd57

Committed by duanboqiang on Aug 16, 2021

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* remove unity build

* add unique_consecutive op

* add unique_consecutive op

* add enable static

* add noqa

* add space line

* add default case.

* add comma

* add space line

* modify unique_consecutive unittest

* optimize ut coverage

* rebase develop

* improve coverage

* update en docs

* update en docs

* update en docs

* update en docs

* update en docs

* update en doc

875cfd57

13 Aug, 2021 2 commits
- fix a bug of slice by none index (#34877) · ff4bdac3
  Committed by zyfncg on Aug 13, 2021
  
  ff4bdac3
- fix npu_finalize (#34857) · 17a99760
  Committed by ronnywang on Aug 13, 2021
  
  17a99760
12 Aug, 2021 2 commits
- [Inference] Inference python api support fp16 (#34676) · 6326c3ef
  Committed by Wilber on Aug 12, 2021
  
  6326c3ef
- [HybridParallel]Add Recompute for PipeLineParallel (#34607) · 589d13c5
  Committed by ShenLiang on Aug 12, 2021
```
* add recompute for pp

* add recompute offload

* add recompute partition
```
  589d13c5
11 Aug, 2021 2 commits
- [NPU] add momentum_op_npu and test (#34082) · 9e3e08f0
  Committed by ronnywang on Aug 11, 2021
```
* add momentum_op_npu and test

* update

* fix hang
```
  9e3e08f0
- add the basic apis for auto_parallel (#33804) · 3f962e77
  Committed by lilong12 on Aug 11, 2021
```
* add auto_parallel apis
```
  3f962e77
10 Aug, 2021 1 commit

copy boost/any.hpp to utils and replace boost::any with self defined any (#34613) · 12892929

Committed by chentianyu03 on Aug 10, 2021

* add any.hpp to utils and replace boost::any with self defined paddle::any

* add copy any.hpp to custom op depends

* modify any.hpp include path

* remove boost from setup.py.in

* add copy any.hpp to custom op depends

* move any.hpp to paddle/utils/ dirs

* move any.h to extension/include directory

* copy utils to right directories

12892929

09 Aug, 2021 1 commit
- Increase the speed of incremental compilation (#34616) · aab4d6e4
  Committed by zhouweiwei2014 on Aug 09, 2021
  
  aab4d6e4
06 Aug, 2021 1 commit
- add get xpu version api (#34594) · 8a9dc5dc
  Committed by TTerror on Aug 06, 2021
  
  8a9dc5dc
05 Aug, 2021 1 commit

New executor dev (#34407) · 012d12b5

Committed by hong on Aug 05, 2021

* first test version

* add test exec;

* add data transfer; test=develop

* add new exec head;

* add memcpy; test=develop

* add python fetch

* add new test

* add graph node; test=develop

* remove useless new executor test; test=develop

* remove gperf dependency; test=develop

* fix compile bugs; test=develop

* remove useless code; test=develop

* remove useless code; test=develop

* add unit test; test=develop

* polish code; test=develop

* polish code; test=develop

* add interpreter cmakefile; test=develop

* remove useless code; test=develop

012d12b5

04 Aug, 2021 1 commit
- fix API bug of Tensor.cuda (#34416) · 54b6c390
  Committed by zhouweiwei2014 on Aug 04, 2021
  
  54b6c390
03 Aug, 2021 1 commit
- support Kunlun2 (#34459) · 2d0f3d9b
  Committed by QingshuChen on Aug 03, 2021
```
* support Kunlun2

* support KL2

* support KL2
```
  2d0f3d9b
02 Aug, 2021 2 commits

Add basic functions of Program Pass (#34524) · 145cdb5a

Committed by Zeng Jinle on Aug 02, 2021

* add basic APIs

* add attr_types

* follow comments

* change pass attr types

* add set pass attribute codes

* refine PADDLE_THROW

145cdb5a

Fix Inference CE Error by Topo Order (#34521) · 508b40ec

Committed by Huihuang Zheng on Aug 02, 2021

The comment background message is too long, see details at https://github.com/PaddlePaddle/Paddle/pull/34521

508b40ec

29 Jul, 2021 1 commit

add fix op run order pass (#34427) · 79e758c6

Committed by Zeng Jinle on Jul 29, 2021

* add fix op run order pass

* add ut for fix_op_run_order

* fix ci error

* improve coverage

* improve coverage again and fix cpu test case

* follow some comments

79e758c6

28 Jul, 2021 1 commit

graph_to_program save parameter and stop_gradient information (#33771) · 8a7dee31

Committed by jiangcheng on Jul 28, 2021

This PR adds optional boolean fields is_parameter and stop_gradient to the VarDesc proto and removes them during save_inference_model.

8a7dee31

27 Jul, 2021 1 commit

Revert "Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support... · 0dd6a44a

Committed by Aurelius84 on Jul 27, 2021

Revert "Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy for pass (#34181)" (#34348)" (#34384)

This reverts commit 577fdde5.

0dd6a44a

26 Jul, 2021 1 commit

Support getitem by None index in dynamic mode (#34338) · a0bbc992

Committed by zyfncg on Jul 26, 2021

* Support getitem by ellipsis index in dynamic mode

* change some code style

* Support getitem by none index in dynamic mode

* modify a comments style and remove useless code

a0bbc992

23 Jul, 2021 1 commit

Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy... · 577fdde5

Committed by Aurelius84 on Jul 23, 2021

Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy for pass (#34181)" (#34348)

This reverts commit 609f8225.

577fdde5

22 Jul, 2021 3 commits
- [Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy for pass (#34181) · 609f8225
  Committed by Aurelius84 on Jul 22, 2021
```
* modify into program_id

* fix cache_info declare problem

* fix python int to C long problem

* modify point to reference

* add ENVS
```
  609f8225
- enable amp unsupported_fp16_list for npu (#34314) · b0a2f005
  Committed by Leo Chen on Jul 22, 2021
  
  b0a2f005
- Support getitem by ellipsis index in dynamic mode (#34267) · 82339ed1
  Committed by zyfncg on Jul 22, 2021
```
* Support getitem by ellipsis index in dynamic mode

* change some code style
```
  82339ed1
21 Jul, 2021 1 commit
- fix cuda Stream record_event bug (#34285) · d953f8a9
  Committed by chentianyu03 on Jul 21, 2021
  
  d953f8a9
19 Jul, 2021 1 commit

Add Cuda event and stream API (#32460) · 9c7f6af5

Committed by chentianyu03 on Jul 19, 2021

* add cuda event and stream api

* add cuda event and stream api

* add get_current_stream api

* add get_current_stream api

* init streams

* modify get_current_stream

* modify get_current_stream

* add synchronize func

* add current_stream doc and test file

* move get_current_stream into CUDA macro

* move CudaEvent into CUDA macro

* move _get_current_stream and _device_synchronize into cuda macro

* modify the macro of cuda stream and event

* add test case for synchronize

* add paddle.devices.cuda module

* event and stream support hip

* add doc for stream and event class

* move cuda stream and event into single pybind

* add cuda_streams_py.cc to cmakelist

* add _device_synchronize and _get_current_stream to core module

* add test case for cudastream and cudaevent

* move __all__ in streams.py

* fix test fail

* add cuda to devices __all__

* fix current_stream doc writing error

* move devices to device directory, and merge device.py into __init__.py

* add required:gpu to sample codes

* remove cuda direction from device/__init__.py

9c7f6af5
