Commits · aba759ba16422abf8cd39ae7e19d24f5997b9ade · Crayon鑫 / Paddle

21 Sep, 2020 1 commit

[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112) · aba759ba

Committed by Leo Chen on Sep 21, 2020

* support use add instead of sum to do gradient accumulation

* add inplace addto pass

* add grad_add op and inplace addto pass

* remove debug code

* code refine

* fix bug when several sum ops are inserted at the same op_idx

* fix Flags type

* add addto attribute for conv3d

* fix ut

* code clean

* fix type

aba759ba
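
The commit above replaces `sum` with in-place add for gradient accumulation. As a rough illustration only, the sketch below contrasts the two styles on plain Python lists. The function names are hypothetical and this is not Paddle's pass or its `grad_add` op; it only assumes that the in-place variant reuses the first gradient buffer instead of allocating a new one.

```python
def accumulate_with_sum(grads):
    """Sum all partial gradients at once into a freshly allocated buffer."""
    return [sum(vals) for vals in zip(*grads)]


def accumulate_inplace_addto(grads):
    """Add each later partial gradient into the first buffer, reusing its storage."""
    out = grads[0]  # no new buffer: the first gradient doubles as the accumulator
    for g in grads[1:]:
        for i, v in enumerate(g):
            out[i] += v
    return out


if __name__ == "__main__":
    partial = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(accumulate_with_sum(partial))
    print(accumulate_inplace_addto([list(g) for g in partial]))
```

Both functions return the same values; the difference is that the second one mutates and returns `grads[0]`, which is the memory-saving idea behind an add-to strategy.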

15 Sep, 2020 1 commit

[Pass Compatible] Bind python compatible. (#27262) · f827665a

Committed by Wilber on Sep 15, 2020

f827665a

27 Aug, 2020 1 commit

[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552) · 1c681383

Committed by lilong12 on Aug 27, 2020

* add collective op for cpu using gloo and paddle.distributed.* apis

1c681383

21 Aug, 2020 1 commit

support Baidu Kunlun AI Accelerator (#25959) · 138ecf24

Committed by QingshuChen on Aug 21, 2020

* support Baidu AI Accelerator
  * test=kunlun

* minor
  * test=kunlun

* support xpu op in separate file
  * test=kunlun

* update XPU error message and remove duplicated code
  * test=kunlun

* minor
  * test=kunlun

* minor
  * test=kunlun

138ecf24

18 Aug, 2020 1 commit

add cpu random Generator (#26013) · 23261ff4

Committed by yaoxuefeng on Aug 18, 2020

23261ff4

16 Aug, 2020 2 commits

[API2.0] add op for cudnn version query test=develop (#26180) · 0b81d763

Committed by wangchaochaohu on Aug 16, 2020

0b81d763

[API2.0] add Device api (set_device and get_device) (#26103) · bb11cbc2

Committed by wangchaochaohu on Aug 16, 2020

bb11cbc2

15 Aug, 2020 1 commit

expose and unify the Tensor concepts to the user (#25978) · 6de463d3

Committed by Zhou Wei on Aug 15, 2020

* expose and unify the Tensor concepts to the user

* expose tensor to user

* add copy place for Tensor

* add copy place for Tensor

* add note

* add macro PADDLE_WITH_CUDA

* remove RUN_TYPE=DIST

* fix some error

6de463d3

13 Aug, 2020 1 commit

Fix loaded variable suffix repeat error (#26169) · 838e36e9

Committed by Chen Weihang on Aug 13, 2020

* fix loaded var suffix repeat error

* use new dygraph name for loaded param

838e36e9

06 Aug, 2020 1 commit

add heter ps mode (#25682) · 0cb60c70

Committed by Thunderbrook on Aug 06, 2020

* add heter ps mode

* code style
test=develop

* add with_pslib
test=develop

* unittest
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* test monitor
test=develop

* prepare trainer
test=develop

* code style
test=develop

0cb60c70

30 Jul, 2020 1 commit

Integrated Trainer of Parameter Server (API add... · caa90a65

Committed by tangwei12 on Jul 30, 2020

Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)

* Integrated Trainer of Parameter Server

caa90a65

07 Jul, 2020 1 commit

Fix typo in interface. (#24779) · 80f1c507

Committed by gongweibao on Jul 07, 2020

80f1c507

16 Jun, 2020 1 commit

Monitor Framework (#24079) · 5822862d

Committed by hutuxian on Jun 16, 2020

* Add a StatValue class in the backend to represent a stat.
* Add a singleton StatRegistry to maintain the collection of stats.
* To keep the code simple, only int and float stat types are supported, which covers most scenarios.

5822862d
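
To make the Monitor Framework description above more concrete, here is a minimal Python sketch of a stat value plus a singleton registry. It is only an analogue under stated assumptions: the commit describes backend classes named `StatValue` and `StatRegistry`, but every method name below (`increase`, `get`, `instance`, `register`) and the locking scheme are hypothetical and not taken from the Paddle source.

```python
import threading


class StatValue:
    """One named statistic; only int and float values are accepted."""

    def __init__(self, name, initial=0):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(initial, bool) or not isinstance(initial, (int, float)):
            raise TypeError("only int and float stats are supported")
        self.name = name
        self._value = initial
        self._lock = threading.Lock()

    def increase(self, delta):
        """Add delta to the stat and return the new value."""
        with self._lock:
            self._value += delta
            return self._value

    def get(self):
        with self._lock:
            return self._value


class StatRegistry:
    """Singleton that maintains the collection of stats."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._stats = {}
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        """Return the single shared registry, creating it on first use."""
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def register(self, name, initial=0):
        """Create the stat if it does not exist yet and return it."""
        with self._lock:
            if name not in self._stats:
                self._stats[name] = StatValue(name, initial)
            return self._stats[name]

    def get(self, name):
        with self._lock:
            return self._stats[name]
```

A caller would fetch the registry with `StatRegistry.instance()`, register a stat once, and then update it from anywhere in the process through the same shared object.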

08 Jun, 2020 1 commit

Refine error message in pybind folder (#24886) · 6190023a

Committed by Leo Chen on Jun 08, 2020

* refine err_msg of pybind.cc, test=develop

* refine err_msg in tensor_py.h, test=develop

* refine error msg, test=develop

* fix test_exception, test=develop

* follow comments, test=develop

6190023a

03 Jun, 2020 1 commit

Add crypto python (#24836) · aa47356b

Committed by Yanghello on Jun 03, 2020

* add crypto helper for paddle, test=develop

* cryptopp.cmake bug fixed, test=develop

* remove debug build type, test=develop

* fixed CMakeLists for new target, test=develop

* fix CI bug, test=develop

* add cmake option flag DWITH_CRYPTO, test=develop

* add crypto api for python, test=develop

* Revert "add crypto api for python, test=develop"

This reverts commit 3a1cfa9d.

* Revert "Add crypto api (#24694)"

This reverts commit 5a7a517c.

* Revert "Revert "Add crypto api (#24694)""

This reverts commit f952b19f.

* fixed cryptopp cmake building error, test=develop

* change WITH_CRYPTO building option to OFF, test=develop

* fixed cipher test failed, test=develop

* "add crypto api for python, test=develop"

This reverts commit 83fb55c0.

* travis CI bug fixed, test=develop

* fixed test in python3

* test=develop

* fixed unittest, test=develop

aa47356b

11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

aa0f254f

26 Apr, 2020 1 commit

fix example code, test=develop, test=document_fix (#24139) · ab8f8fa7

Committed by Zhang Ting on Apr 26, 2020

ab8f8fa7

24 Apr, 2020 1 commit

supports loading model from memory, test=develop (#24098) · 46f3139c

Committed by 石晓伟 on Apr 24, 2020

46f3139c

19 Apr, 2020 1 commit

Support LoDTensorArray in fetch (#23645) · 2b896c1f

Committed by guofei on Apr 19, 2020

* Support LoDTensorArray in fetch op

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

* Support LoDTensorArray in fetch

test=develop

2b896c1f

10 Apr, 2020 1 commit

Add AfsAPI in PaddleBox (#23419) · 94a3789f

Committed by hutuxian on Apr 10, 2020

* Introduces AfsAPI to resolve slow downloading.
* Mainly used in PaddleBox

94a3789f

09 Apr, 2020 2 commits

api build strategy error polish, test=develop (#23546) · df538439

Committed by Chen Weihang on Apr 09, 2020

df538439

Remove: NGraph engine from PDPD repository (#23545) · 3baaee9a

Committed by mozga-intel on Apr 09, 2020

* Remove the NGraph engine from PDPD repository
1. Each operator was removed from the operator's directory
2. Each test was removed from the unittest directory
3. The parallel executor support was removed from the PDPD
4. The CMake file was removed from the PDPD
5. The NG flags were removed from the repository
test=develop

* Remove ngraph from:
1. Cmake file
2. Python file
test=develop

3baaee9a

06 Apr, 2020 1 commit

Implement StaticModelRunner to support dygraph fine-tune static graph pre-training model (#23171) · 75bd3507

Committed by Chen Weihang on Apr 06, 2020

* static model runner basic implement, test=develop

* add run program op to execute loaded program, test=develop

* refactor static model runner & run program op, test=develop

* reset engine.cc to resolve conflict

* adapt the change of dygraph double grad, test=develop

* refactor impl to solve control flow error, test=develop

* clear debug code, test=develop

* fix ci str compatible error & checkout dygraph grad maker & add example, test=develop

* hide api & add op test, test=develop

* fix run program op test places error, test=develop

* fix program by review comment, test=develop

* delete change var desc name, test=develop

* fix other program by review comment, test=develop

* remove _static_graph_guard, test=develop

* add selectedrows test, test=develop

* remove desc parser, test=develop

* fix detail program, test=develop

* change scope create & add test, test=develop

75bd3507

03 Apr, 2020 2 commits

Add fleet checkpoint on local fs and remote fs (such as hdfs) for EDL (#22586) · 24a063f6

Committed by gongweibao on Apr 03, 2020

24a063f6

[feature] prune program by feed and fetch_list automatically (#22474) · a62599a8

Committed by Leo Chen on Apr 03, 2020

* prune train program by fetch_list, test=develop

* add unittest for prune, test=develop

* fix pruned feed, test=develop

* support ParallelExecutor and feed prune, test=develop

* add comments, test=develop

* update unittest, test=develop

* update unittests, test=develop

* remove debug code, test=develop

* support cond in clone, test=develop

* support cond in prune, test=develop

* support multiple minimize, test=develop

* support cache, test=develop

* fix _copy_param_info_from, test=develop

* support python2 str, test=develop

* remove debug code, test=develop

* fix bug of caching CompiledProgram, test=develop

* fix multi_device issue, test=develop

* tmp

* support tuple in fetch_list and overriding use_prune, test=develop

* don't use nonlocal in python2, test=develop

* remove nonlocal, test=develop

* code clean, test=develop

* code clean, test=develop

* feed list, test=develop

* test adam, test=develop

* follow comments, test=develop

* reduce duplicate code, test=develop

* update comments, test=develop

a62599a8

20 Mar, 2020 1 commit

Reader sequential and inference partial feed (#22699) · acfc9b8a

Committed by Zeng Jinle on Mar 20, 2020

* sequential reader stage 1, test=develop

* fix ut, test=develop

* fix iterable=False reset bug, add some logs and polish code, test=develop

* inference feed partial data, test=develop

* Turn on keep_order=True for test, test=develop

* enhance ut to test more cases, test=develop

* test commit for reverting

* Revert "test commit for reverting", test=develop

This reverts commit 80aef42e.

* add ut of merged and unmerged results, test=develop

* add more uts for coverages and add en doc of api, test=develop

* follow comments, test=develop

* change note style, test=develop

acfc9b8a

02 Mar, 2020 2 commits

Unmerged fetch list (#22635) · 89cfa491

Committed by Zhen Wang on Mar 02, 2020

* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results.

* add the unit test for fetch_unmerged.

* update ut for multi-card and multi-cpu.

* add the error message and the user suggestion in FetchOpHandle. test=develop

89cfa491

Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541) · 7d8d5734

Committed by Chen Weihang on Mar 02, 2020

* add lodtensor share memory & serialization, test=develop

* fix windows compile error, test=develop

* deal vartype pickle & fix unittest matching error message, test=develop

* update timeout variable name, test=develop

* refactor memory map implement, test=develop

* clear mmap file descriptor when exiting unexpectedly, test=develop

* remove the child process fd in advance, test=develop

* remove mmap fds after Queue.put in child process, test=develop

* add hard unittests for register exit func, test=develop

* fix python2 compatibility problem in unittest, test=develop

* fix exception unittest error, test=develop

* polish code based review comment, test=develop

7d8d5734

26 Feb, 2020 1 commit

support cond in clone, test=develop (#22657) · b2c1be85

Committed by Leo Chen on Feb 26, 2020

* support cond in clone, test=develop

* refine code, test=develop

* refine code, test=develop

* follow comments, test=develop

* refine code, test=develop

b2c1be85

25 Feb, 2020 1 commit

PaddleBox Framework Part2 (#22466) · 175954d8

Committed by hutuxian on Feb 25, 2020

* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether the remaining instances are discarded when they cannot be distributed evenly.
* Remove CPU code in Pull/PushSparse and we will add it back when testing it fully.
* Fix some known issues: such as copying persistable vars after one epoch running.

175954d8

18 Feb, 2020 1 commit

add flag to control profile level in python API (#22319) · c65c6ae5

Committed by wangchaochaohu on Feb 18, 2020

* add python flag to control profile level test=develop

c65c6ae5

12 Feb, 2020 1 commit

fix bug with compiledProgram (#22495) · b0675c81

Committed by tangwei12 on Feb 12, 2020

* add thread barrier for the compiled program

b0675c81

10 Feb, 2020 1 commit

Compile without nccl deps. [2/2] (#22484) · de009152

Committed by Wilber on Feb 10, 2020

Compile without nccl deps. [1/2]
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

de009152

07 Feb, 2020 1 commit

Enable the detection of subgraph composed of grad ops (#21223) · dcfb6038

Committed by Yiqun Liu on Feb 07, 2020

* Add the first implementation of fusion_group op #19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. #21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearrangement of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shape of all inputs are the same.

* Add debug information.

* Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop

dcfb6038

05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

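Added a WITH_NCCL option to cmake that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

NCCL compilation can be disabled for single-machine, single-GPU use. Multi-GPU use requires NCCL to stay enabled (the default); if NCCL is disabled, only a single GPU can be used.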
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

7bc4b095
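
The WITH_NCCL rule in the commit above can be restated as a small decision function. The sketch below is written in Python purely for illustration; the real logic lives in the project's CMake files, and the function names and the `visible_gpus` parameter are hypothetical.

```python
def resolve_with_nccl(with_gpu, with_nccl=True):
    """Effective WITH_NCCL value: ON by default, forced OFF when WITH_GPU is OFF."""
    return bool(with_nccl) and bool(with_gpu)


def max_usable_gpus(with_gpu, with_nccl, visible_gpus):
    """How many GPUs a build could use under the rule stated in the commit message."""
    if not with_gpu:
        return 0  # CPU-only build
    if not resolve_with_nccl(with_gpu, with_nccl):
        return min(1, visible_gpus)  # without NCCL, only a single GPU can be used
    return visible_gpus


if __name__ == "__main__":
    print(resolve_with_nccl(with_gpu=False))  # NCCL is switched off with the GPU build
    print(max_usable_gpus(True, False, 8))    # single-GPU only when NCCL is off
```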

17 Jan, 2020 1 commit

Implement a common python unittest to test the ir passes. (#22209) · b7cac50b

Committed by Yiqun Liu on Jan 17, 2020

* Implement a common python unittest to test the ir passes.
test=develop

* Save the results in np.array and support to startup on CPU.
test=develop

* Fix the unittest.
test=develop

* Add check_program to check whether the optimized program is different from the origin one.
test=develop

* Remove the interface all_ops.
test=develop

* Add exception test in pass_test.
test=develop

b7cac50b

14 Jan, 2020 1 commit

add collective communication library in fleet (#22211) · e3a457d3

Committed by xujiaqi01 on Jan 14, 2020

* add collective communication library in fleet to replace mpi
* test=develop

e3a457d3

10 Jan, 2020 1 commit

Add bn and relu fuse pass (#22048) · 46189b16

Committed by Zhen Wang on Jan 10, 2020

* add bn and relu fuse pass

* add op attr assert and dtype assert

* fix some inputs&&outputs bugs for the fused op and pattern.

* add the unittest for fuse_bn_act_pass. test=develop

* use normative enforce statements. test=develop

* add the cpu test. test=develop

* add the support of batch_size=1 for the bn with relu op. test=develop

* add the error type for paddle throws. test=develop

* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop

46189b16

06 Jan, 2020 1 commit

Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029) · dd436156

Committed by Huihuang Zheng on Jan 06, 2020

dd436156

18 Dec, 2019 1 commit

Fix Backward Bugs in Conditional Block (#21809) · 557bce77

Committed by Huihuang Zheng on Dec 18, 2019

The fixed bugs:

1. The condition sub-graph is not pruned
2. When the backward graph is extremely simple, all backward ops are pruned.

557bce77

Crayon鑫 / Paddle: in sync with the fork's source project
