1. 15 Aug 2022, 1 commit
    • [Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe
      Committed by Yulong Ao (a pybind11 sketch follows this entry)
      * [Auto Parallel] Move the distributed info from python to c++
      
      * [Auto Parallel] Add dist_attrs for VarDesc and OpDesc
      
      * [Auto Parallel] Add the lost file
      
      * [Auto Parallel] Make the dist attr be unique_ptr
      
      * [Auto Parallel] Add the proto conversion
      
      * [Auto Parallel] Improve the proto support
      
      * [Auto Parallel] Fix the bugs for adding a device or a link
      
      * [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper
      
      * [Auto Parallel] Improve the impl of these dist attrs
      
      * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh
      
      * [Auto Parallel] Fix the unittest problem
      
      * [Auto Parallel] Explicitly add the src file for auto_parallel target
      
      * [Auto Parallel] Add the proto dependency explicitly
      
      * [Auto Parallel] Fix the cmake bug on windows and mac
      
      * [Auto Parallel] Remove the pybind11 header file in process_mesh.h
      
      * [Auto Parallel] Remove unused codes
      
      * [Auto Parallel] Check whether the dist attr is null
      
      * [Auto Parallel] Implement the assign operator for OpDesc explicitly
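The "Pybind11 ProcessMesh and DeviceMesh" step above is the part that exposes the new C++ classes to Python. As a rough illustration of that pattern, here is a minimal pybind11 sketch; the class ToyProcessMesh, its fields, and the module name are invented for this example and are not Paddle's actual definitions.

```cpp
// Illustrative only: a simplified stand-in for a C++ process-mesh class
// exposed to Python via pybind11 (not Paddle's real ProcessMesh).
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace py = pybind11;

class ToyProcessMesh {
 public:
  ToyProcessMesh(std::vector<int64_t> shape, std::vector<int64_t> process_ids,
                 std::vector<std::string> dim_names)
      : shape_(std::move(shape)),
        process_ids_(std::move(process_ids)),
        dim_names_(std::move(dim_names)) {}

  const std::vector<int64_t>& shape() const { return shape_; }
  const std::vector<int64_t>& process_ids() const { return process_ids_; }
  const std::vector<std::string>& dim_names() const { return dim_names_; }

 private:
  std::vector<int64_t> shape_;
  std::vector<int64_t> process_ids_;
  std::vector<std::string> dim_names_;
};

PYBIND11_MODULE(toy_auto_parallel, m) {
  // Read-only properties mirror the C++ getters on the Python side.
  py::class_<ToyProcessMesh>(m, "ProcessMesh")
      .def(py::init<std::vector<int64_t>, std::vector<int64_t>,
                    std::vector<std::string>>())
      .def_property_readonly("shape", &ToyProcessMesh::shape)
      .def_property_readonly("process_ids", &ToyProcessMesh::process_ids)
      .def_property_readonly("dim_names", &ToyProcessMesh::dim_names);
}
```

Binding the mesh classes this way is what lets the Python auto-parallel layer keep its interface while the authoritative data lives in C++, as the commit title describes.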
  2. 10 Aug 2022, 2 commits
  3. 19 Jul 2022, 1 commit
  4. 26 Jun 2022, 1 commit
  5. 04 Apr 2022, 1 commit
    • Add dropout yaml (#41355) · 1c7001e7
      Committed by hong
      * add dropout slice yaml
      
      * remove useless code
      
      * fix infer shape error
      
      * skip infrt compile for dropout
  6. 30 Dec 2021, 1 commit
  7. 09 Oct 2021, 1 commit
  8. 18 Sep 2021, 1 commit
  9. 15 Sep 2021, 1 commit
  10. 24 Aug 2021, 1 commit
    • Add auto completion module for auto parallel (#34813) · 93d862b0
      Committed by Yulong Ao (a dims_mapping sketch follows this entry)
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * add dist
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * delete unused proto
      
      * restore op_desc
      
      * restore type_defs
      
      * update var_desc
      
      * remove dims_mapping for proto_pybind
      
      * update interface.py
      
      * update framework.py
      
      * update
      
      * update
      
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * [WIP] Add the auto completion feature and related codes
      
      * [WIP] Improve the auto completion and related codes
      
      * [WIP] Make the auto completion to support data-parallel
      
      * [WIP] Make the completion support mp and dp+mp
      
      * [WIP] Refactor auto completion unit test for MLP
      
      * [WIP] Refactor the implementation of DistributedOperatorImpl
      
      * [WIP] Improve dims_mapping update rule and fix a bug
      
      * [WIP] Support auto completion for one transformer decoder layer
      
      * [WIP] Add a minor change
      
      * [WIP] Fix a bug within the unit test
      
      * Shard XShape tensor, add embedding completion and refactor code
      
      * Add the distributed_operators dir to setup.py.in
      
      * Improve the completion process and add the unittest for gpt
      
      * fix process_mesh ut
      
      * fix process_mesh ut
      
      * update
      
      * update, test=develop
      
      * Add support for automatically completing distributed attrs of special ops
      
      * update
      
      * update
      
      * update
      
      * fix doc sample codes, test=develop
      
      * improve coverage, test=develop
      
      * add static_mode check, test=develop
      
      * Model the cluster for cost model and physical mapping
      
      * update, test=develop
      
      * add set_placement, test=develop
      
      * Add the check to make sure the candidate tensors' size is greater than zero
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update, test=develop
      
      * Auto mark dist attrs annotated by user
      
      * update ndarray to nested list, test=develop
      
      * update, test=develop
      
      * Add auto-completion module for auto-parallel (based on PR#33804)
      
      * Remove unnecessary files
      
      * Remove unrelated files for the auto completion pr
      
      * Update the unit test to improve the coverage
      
      * Modify codes based on reviews
      
      * Minor changes for CI
      
      * Improve some codes based on new comments
      
      * Fix bugs caused by shallow copy in attributes.py
      * Improve amend_distributed_attr_for_program in context.py
      * Other changes for weihang's comments
      Co-authored-by: sandyhouse <lilong12@baidu.com>
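To make the "dims_mapping update rule" mentioned in this entry easier to picture: each tensor axis is mapped to a mesh dimension, with -1 meaning the axis is not sharded, and two candidate mappings are merged axis by axis, with -1 yielding to a concrete mesh dimension and two different concrete dimensions signalling a conflict. The helper below is a minimal sketch of that idea only; the name MergeDimsMapping and the types are illustrative, not Paddle's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Merge two per-axis shard mappings (-1 = replicated on that axis).
// Returns std::nullopt when the mappings disagree on some axis.
std::optional<std::vector<int64_t>> MergeDimsMapping(
    const std::vector<int64_t>& lhs, const std::vector<int64_t>& rhs) {
  if (lhs.size() != rhs.size()) return std::nullopt;
  std::vector<int64_t> merged(lhs.size(), -1);
  for (size_t i = 0; i < lhs.size(); ++i) {
    if (lhs[i] == rhs[i] || rhs[i] == -1) {
      merged[i] = lhs[i];
    } else if (lhs[i] == -1) {
      merged[i] = rhs[i];
    } else {
      return std::nullopt;  // both sharded, but on different mesh dims
    }
  }
  return merged;
}

// Example: {0, -1} merged with {-1, 1} gives {0, 1};
// {0, -1} merged with {1, -1} is a conflict (nullopt).
```

The completion pass in this PR repeatedly propagates such mappings between operators and tensors until no distributed attribute changes, which is why the update rule matters.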
  11. 15 Jul 2021, 1 commit
    • Class for processing program (#33439) · 85642a0d
      Committed by huangxu96 (a sketch follows this entry)
      This PR creates a class for processing the program at the C++ level. Currently, this class has one method: GetInputsOutputsInBlock().
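As a rough sketch of what a GetInputsOutputsInBlock()-style helper computes, the function below walks the operators of a block and separates variable names into block-level inputs (read before any operator in the block produces them) and outputs (produced inside the block). The ToyOp/ToyBlock types are stand-ins invented for this example, not Paddle's BlockDesc/OpDesc API.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for operator/block descriptors (illustrative only).
struct ToyOp {
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

struct ToyBlock {
  std::vector<ToyOp> ops;
};

// Collect names read by the block but produced outside it (inner_inputs)
// and names produced by operators inside the block (inner_outputs).
void GetInputsOutputsInBlock(const ToyBlock& block,
                             std::set<std::string>* inner_inputs,
                             std::set<std::string>* inner_outputs) {
  for (const ToyOp& op : block.ops) {
    for (const std::string& in : op.inputs) {
      // Only count it as a block input if no earlier op produced it.
      if (inner_outputs->count(in) == 0) inner_inputs->insert(in);
    }
    for (const std::string& out : op.outputs) inner_outputs->insert(out);
  }
}
```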
  12. 26 Apr 2021, 1 commit
  13. 24 Sep 2020, 1 commit
    • use iwyu clean include (#27267) · df43905f
      Committed by wanghuancoder (an include example follows this entry)
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
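For context, include-what-you-use (iwyu) enforces that each file directly includes the headers for the symbols it uses instead of relying on transitive includes, and suggests forward declarations where a full definition is unnecessary. A generic before/after sketch (the file and type names are illustrative, not from this PR):

```cpp
// Before: relies on heavy_header.h to pull in <string> and <vector>
// transitively, which breaks as soon as heavy_header.h is trimmed.
//
//   #include "heavy_header.h"
//   std::vector<std::string> Names();

// After: include exactly what this file uses, and forward-declare
// types that are only used by pointer or reference.
#include <string>
#include <vector>

class Tensor;  // forward declaration is enough for a reference parameter

std::vector<std::string> Names();
void Inspect(const Tensor& t);
```

The long series of "fix compilation error" commits in this entry is the typical fallout of such a cleanup: files that compiled only through transitive includes must be fixed one by one.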
  14. 23 Jun 2020, 1 commit
  15. 11 May 2020, 1 commit
    • Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f
      Committed by Chen Weihang (a macro sketch follows this entry)
      * add new macro BOOST_GET_SAFELY & unittests, test=develop
      
      * add different macro type, test=develop
      
      * fix get macro type in executor, test=develop
      
      * four macro part change backup
      
      * using one macro for all case, test=develop
      
      * revert attribute change, test=develop
      
      * change to three func to solve gcc4.8 bug, test=develop
      
      * polish some details, test=develop
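A hedged sketch of the kind of wrapper this commit describes: it forwards to boost::get, but converts a bare boost::bad_get into an error message carrying the requested type and the call site. The names TOY_BOOST_GET and GetWithContext are invented for this illustration and are not Paddle's exact macro.

```cpp
#include <boost/variant.hpp>

#include <sstream>
#include <stdexcept>
#include <string>

// Illustrative helper: rethrow boost::bad_get with richer context.
template <typename T, typename VariantT>
T& GetWithContext(VariantT& v, const char* type_name, const char* file,
                  int line) {
  try {
    return boost::get<T>(v);
  } catch (boost::bad_get&) {
    std::ostringstream os;
    os << "boost::get failed: expected type " << type_name
       << " (variant holds index " << v.which() << ") at " << file << ":"
       << line;
    throw std::runtime_error(os.str());
  }
}

// The macro captures the requested type name and call site automatically.
#define TOY_BOOST_GET(type, value) \
  GetWithContext<type>((value), #type, __FILE__, __LINE__)

// Usage:
//   boost::variant<int, std::string> v = std::string("hi");
//   int x = TOY_BOOST_GET(int, v);  // throws with type/file/line info
```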
  16. 31 Oct 2019, 1 commit
    • GradMaker for dygraph (#19706) · 8c4573a3
      Committed by hong (a grad-maker sketch follows this entry)
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * optimize grad maker; test=develop
      
      * optimize grad maker
      
      * test
      
      * grad maker optim; test=develop
      
      * fix unittest bugs; test=develop
      
      * add dygraph grad op maker and split_op
      
      * grad op maker refactor; test=develop
      
      * add dygraph grad maker; test=develop
      
      * fix op deformable_conv_v1_op bug; test=develop
      
      * fix deformable_conv prroi pool bugs;
      
      * fix new op grad op maker bug; test=develop
      
      * fix split by ref bug; test=develop
      
      * fix dygraph auto prune bug; test=develop
      
      * fix test_trace bug; test=develop
      
      * fix fused emb seq pool bug; test=develop
      
      * remove useless code in op_desc file; test=develop
      
      * remove useless code, StrVarBaseNode; test=develop
      
      * fix review issues; test=develop
      
      * fix rank_loss grad maker; test=develop
      
      * remove flag in VarBase; test=develop
      
      * fix distributed_notify_op compile bug ; test=develop
      
      * fix reshape op double grad; test=develop
      
      * fix expand as op; test=develop
      
      * add imperative type_defs.h for demo_train; test=develop
      
      * fix inference lib cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix inference_lib; test=develop
      
      * fix inference cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix inference lib; test=develop
      
      * remove condition dygraph grad maker, modify local name; test=develop
      
      * fix split grad maker bug; test=develop
      
      * fix pyramid_op bug; test=develop
      
      * change travis time out limit; test=develop
      
      * restore travis; test=develop
      
      * change timeout limit; test=develop
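The grad-op-maker pattern being refactored here can be sketched generically: for each forward operator, a per-op maker describes the corresponding backward operator, feeding the forward inputs/outputs plus the output gradients into the backward op and emitting the input gradients. The ToyOpDesc/ToyGradOpMakerBase types below are invented stand-ins, not Paddle's real GradOpMaker interface.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy op description (illustrative only).
struct ToyOpDesc {
  std::string type;
  std::map<std::string, std::vector<std::string>> inputs;
  std::map<std::string, std::vector<std::string>> outputs;
};

// Conventional name for the gradient of a variable.
std::string GradName(const std::string& name) { return name + "@GRAD"; }

// A maker builds the backward op from the forward op description.
class ToyGradOpMakerBase {
 public:
  explicit ToyGradOpMakerBase(const ToyOpDesc& fwd) : fwd_(fwd) {}
  virtual ~ToyGradOpMakerBase() = default;
  virtual std::unique_ptr<ToyOpDesc> Apply() const = 0;

 protected:
  const ToyOpDesc& fwd_;
};

// Example: the backward of a "mul"-like op reads X, Y and dOut, and
// produces dX and dY.
class ToyMulGradMaker : public ToyGradOpMakerBase {
 public:
  using ToyGradOpMakerBase::ToyGradOpMakerBase;
  std::unique_ptr<ToyOpDesc> Apply() const override {
    auto grad = std::make_unique<ToyOpDesc>();
    grad->type = fwd_.type + "_grad";
    grad->inputs["X"] = fwd_.inputs.at("X");
    grad->inputs["Y"] = fwd_.inputs.at("Y");
    grad->inputs["Out@GRAD"] = {GradName(fwd_.outputs.at("Out")[0])};
    grad->outputs["X@GRAD"] = {GradName(fwd_.inputs.at("X")[0])};
    grad->outputs["Y@GRAD"] = {GradName(fwd_.inputs.at("Y")[0])};
    return grad;
  }
};
```

Sharing one maker between the static graph and dygraph paths, rather than keeping separate backward descriptions, is the main point of the refactor described in this entry.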
  17. 21 Aug 2019, 1 commit
  18. 28 Mar 2019, 1 commit
  19. 12 Dec 2018, 1 commit
  20. 10 Dec 2018, 1 commit
  21. 28 Nov 2018, 1 commit
  22. 26 Oct 2018, 2 commits
  23. 17 Oct 2018, 1 commit
  24. 24 Aug 2018, 1 commit
  25. 15 Aug 2018, 1 commit
  26. 14 Aug 2018, 2 commits
  27. 22 Jun 2018, 1 commit
  28. 31 May 2018, 1 commit
  29. 22 May 2018, 2 commits
  30. 25 Apr 2018, 1 commit
  31. 19 Apr 2018, 1 commit
  32. 26 Feb 2018, 1 commit
  33. 12 Feb 2018, 1 commit
  34. 10 Feb 2018, 2 commits
  35. 23 Jan 2018, 1 commit
    • Memory optimization on Dynamic RNN (#7599) · d76fcb6f
      Committed by QI JUN (a memory-reuse sketch follows this entry)
      * limit variable type to lod tensor in memory optimization transpiler
      
      * refine policy
      
      * support while operator
      
      * fix random seed and training data order
      
      * refine get_cfgs method to support multi while operators
      
      * refine codes
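The core idea of this memory-optimization transpiler work, in rough form: record the last operator that reads each variable; once that operator has run, the variable's buffer becomes available for a later variable of the same size. The sketch below shows only that liveness-and-reuse core with invented types (ToyOp, PlanReuse); the actual transpiler also restricts reuse to LoD tensors and handles while sub-blocks, random seed, and data order, as the commit messages note.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct ToyOp {
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

// Map each variable name to the index of the last op that reads it.
std::map<std::string, size_t> LastUse(const std::vector<ToyOp>& ops) {
  std::map<std::string, size_t> last;
  for (size_t i = 0; i < ops.size(); ++i)
    for (const auto& in : ops[i].inputs) last[in] = i;
  return last;
}

// Plan buffer reuse: a variable whose last reader has executed donates its
// buffer to a later output of the same size. `sizes` maps each variable to
// its (statically known) byte size; this is an assumption of the sketch.
std::map<std::string, std::string> PlanReuse(
    const std::vector<ToyOp>& ops,
    const std::map<std::string, size_t>& sizes) {
  auto last = LastUse(ops);
  std::multimap<size_t, std::string> free_pool;  // size -> reusable var
  std::map<std::string, std::string> reuse;      // new var -> donor var
  for (size_t i = 0; i < ops.size(); ++i) {
    for (const auto& out : ops[i].outputs) {
      auto it = free_pool.find(sizes.at(out));
      if (it != free_pool.end()) {
        reuse[out] = it->second;
        free_pool.erase(it);
      }
    }
    // Release inputs whose last use is this op.
    for (const auto& in : ops[i].inputs)
      if (last[in] == i) free_pool.emplace(sizes.at(in), in);
  }
  return reuse;
}
```

Restricting the analysis to LoD tensors, as the first bullet says, keeps the pass away from variables whose lifetime or size cannot be reasoned about this simply.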