Commits · 24a063f6ac0ba1122b5b6bec524c6ec659197e5f · BaiXuePrincess / Paddle

03 Apr, 2020 2 commits

Add fleet checkpoint on local fs and remote fs (such as hdfs) for EDL (#22586) · 24a063f6

Committed by gongweibao on Apr 03, 2020

24a063f6

[feature] prune program by feed and fetch_list automatically (#22474) · a62599a8

Committed by Leo Chen on Apr 03, 2020

* prune train program by fetch_list, test=develop

* add unittest for prune, test=develop

* fix pruned feed, test=develop

* support ParallelExecutor and feed prune, test=develop

* add comments, test=develop

* update unittest, test=develop

* update unittests, test=develop

* remove debug code, test=develop

* support cond in clone, test=develop

* support cond in prune, test=develop

* support multiple minimize, test=develop

* support cache, test=develop

* fix _copy_param_info_from, test=develop

* support python2 str, test=develop

* remove debug code, test=develop

* fix bug of caching CompiledProgram, test=develop

* fix multi_device issue, test=develop

* tmp

* support tuple in fetch_list and overriding use_prune, test=develop

* dont use nonlocal in python2, test=develop

* remove nonlocal, test=develop

* code clean, test=develop

* code clean, test=develop

* feed list, test=develop

* test adam, test=develop

* follow comments, test=develop

* reduce duplicate code, test=develop

* update comments, test=develop

a62599a8
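
The idea behind pruning a program by feed and fetch_list can be illustrated with a toy model. The sketch below is not Paddle's implementation: `Op`, `prune_ops`, and the example op list are hypothetical names, and a program is reduced to a flat list of ops with named inputs and outputs. It walks the list backward from the fetch targets and keeps only the ops that contribute to them, treating fed variables as already available.

```python
from collections import namedtuple

# Hypothetical, simplified op record: only variable names, no attributes.
Op = namedtuple("Op", ["type", "inputs", "outputs"])


def prune_ops(ops, feed_names, fetch_names):
    """Keep only the ops needed to compute fetch_names from feed_names.

    Walks the op list backward while tracking which variables are still
    needed. Fed variables count as already available, so ops that would
    have produced them are dropped.
    """
    feeds = set(feed_names)
    needed = set(fetch_names) - feeds
    kept = []
    for op in reversed(ops):
        if needed & set(op.outputs):
            kept.append(op)
            needed -= set(op.outputs)
            needed |= set(op.inputs) - feeds
    kept.reverse()
    return kept


example_ops = [
    Op("mul", ["x", "w"], ["h"]),
    Op("relu", ["h"], ["y"]),
    Op("square", ["h"], ["z"]),   # not needed when only "loss" is fetched
    Op("mean", ["y"], ["loss"]),
]

print([op.type for op in prune_ops(example_ops, ["x"], ["loss"])])
# ['mul', 'relu', 'mean']
```

Feeding an intermediate variable prunes its producers as well: `prune_ops(example_ops, ["h"], ["z"])` keeps only the `square` op in this toy model.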

20 Mar, 2020 1 commit

Reader sequential and inference partial feed (#22699) · acfc9b8a

Committed by Zeng Jinle on Mar 20, 2020

* sequential reader stage 1, test=develop

* fix ut, test=develop

* fix iterable=False reset bug, add some logs and polish code, test=develop

* inference feed partial data, test=develop

* Turn on keep_order=True for test, test=develop

* enhance ut to test more cases, test=develop

* test commit for reverting

* Revert "test commit for reverting", test=develop

This reverts commit 80aef42e.

* add ut of merged and unmerged results, test=develop

* add more uts for coverages and add en doc of api, test=develop

* follow comments, test=develop

* change note style, test=develop

acfc9b8a

02 Mar, 2020 2 commits

Unmerged fetch list (#22635) · 89cfa491

Committed by Zhen Wang on Mar 02, 2020

* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results.

* add the unit test for fetch_unmerged.

* update ut for multi-card and multi-cpu.

* add the error message and the user suggestion in FetchOpHandle. test=develop

89cfa491

Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541) · 7d8d5734

Committed by Chen Weihang on Mar 02, 2020

* add lodtensor share memory & serialization, test=develop

* fix windows compile error, test=develop

* deal vartype pickle & fix unittest matching error message, test=develop

* update timeout variable name, test=develop

* refactor memory map implement, test=develop

* clear mmap file descriptor when exiting unexpectedly, test=develop

* remove the child process fd in advance, test=develop

* remove mmap fds after Queue.put in child process, test=develop

* add hard unittests for register exit func, test=develop

* fix python2 compatibility problem in unittest, test=develop

* fix exception unittest error, test=develop

* polish code based review comment, test=develop

7d8d5734

26 Feb, 2020 1 commit

support cond in clone, test=develop (#22657) · b2c1be85

Committed by Leo Chen on Feb 26, 2020

* support cond in clone, test=develop

* refine code, test=develop

* refine code, test=develop

* follow comments, test=develop

* refine code, test=develop

b2c1be85

25 Feb, 2020 1 commit

PaddleBox Framework Part2 (#22466) · 175954d8

Committed by hutuxian on Feb 25, 2020

* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether the remaining instances are discarded when they cannot be distributed evenly.
* Remove the CPU code in Pull/PushSparse; it will be added back once it has been fully tested.
* Fix some known issues, such as copying persistable vars after running one epoch.

175954d8

18 Feb, 2020 1 commit

add flag to control profile level in python API (#22319) · c65c6ae5

Committed by wangchaochaohu on Feb 18, 2020

* add python flag to control profile level test=develop

c65c6ae5

12 Feb, 2020 1 commit

fix bug with compiledProgram (#22495) · b0675c81

Committed by tangwei12 on Feb 12, 2020

* add thread barrier for the compiled program

b0675c81

10 Feb, 2020 1 commit

Compile without nccl deps. [2/2] (#22484) · de009152

Committed by Wilber on Feb 10, 2020

Compile without nccl deps. [1/2]
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

de009152

07 Feb, 2020 1 commit

Enable the detection of subgraph composed of grad ops (#21223) · dcfb6038

Committed by Yiqun Liu on Feb 07, 2020

* Add the first implementation of fusion_group op #19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. #21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearrangement of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shapes of all inputs are the same.

* Add debug information.

* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop

dcfb6038

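05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

Added a WITH_NCCL option to cmake that explicitly controls whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF when WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

Single-machine, single-GPU setups can disable NCCL compilation. Multi-GPU setups need NCCL, which is on by default; if NCCL is disabled, only a single GPU can be used.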
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

7bc4b095
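
The option rules described in this commit can be modeled in a few lines. This is only a sketch of the stated behavior, written in Python rather than cmake; `resolve_with_nccl` and `max_usable_gpus` are hypothetical helper names and not part of the Paddle build system.

```python
def resolve_with_nccl(with_gpu, with_nccl=True):
    """Effective WITH_NCCL value: ON by default, forced OFF without WITH_GPU."""
    return bool(with_gpu and with_nccl)


def max_usable_gpus(with_gpu, with_nccl, visible_gpus):
    """How many GPUs a build could use under the rules in the commit message."""
    if not with_gpu:
        return 0
    if not resolve_with_nccl(with_gpu, with_nccl):
        # Without NCCL only single-GPU execution is possible.
        return min(visible_gpus, 1)
    return visible_gpus
```

For example, under this model a build with WITH_GPU=ON and WITH_NCCL=OFF on a machine with four visible GPUs would be limited to one of them.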

17 Jan, 2020 1 commit

Implement a common python unittest to test the ir passes. (#22209) · b7cac50b

Committed by Yiqun Liu on Jan 17, 2020

* Implement a common python unittest to test the ir passes.
test=develop

* Save the results in np.array and support to startup on CPU.
test=develop

* Fix the unittest.
test=develop

* Add check_program to check whether the optimized program is different from the origin one.
test=develop

* Remove the interface all_ops.
test=develop

* Add exception test in pass_test.
test=develop

b7cac50b

14 Jan, 2020 1 commit

add collective communication library in fleet (#22211) · e3a457d3

Committed by xujiaqi01 on Jan 14, 2020

* add collective communication library in fleet to replace mpi
* test=develop

e3a457d3

10 Jan, 2020 1 commit

Add bn and relu fuse pass (#22048) · 46189b16

Committed by Zhen Wang on Jan 10, 2020

* add bn and relu fuse pass

* add op attr assert and dtype assert

* fix some inputs&&outputs bugs for the fused op and pattern.

* add the unittest for fuse_bn_act_pass. test=develop

* use normative enforce statements. test=develop

* add the cpu test. test=develop

* add the support of batch_size=1 for the bn with relu op. test=develop

* add the error type for paddle throws. test=develop

* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop

46189b16

06 Jan, 2020 1 commit

Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029) · dd436156

Committed by Huihuang Zheng on Jan 06, 2020

dd436156

18 Dec, 2019 1 commit

Fix Backward Bugs in Conditional Block (#21809) · 557bce77

Committed by Huihuang Zheng on Dec 18, 2019

The fixed bugs:

1. The condition sub-graph is not pruned
2. When the backward graph is extremely simple, all of the backward ops are pruned.

557bce77

11 Dec, 2019 1 commit

add `no_need_buffer_slots` interface to pybind (#21575) · 686f0ecb

Committed by mapingshuo on Dec 11, 2019

* add no_need_buffer_slots interface to pybind

686f0ecb

06 Dec, 2019 1 commit

add file check_op_desc.py and add interface to get default value. (#21530) · 9da7e6b4

Committed by liym27 on Dec 06, 2019

* add file check_op_desc.py and add interface to get default value. test=develop

* add test for c++ coverage rate. test=develop

* Correct typo. test=develop

9da7e6b4

05 Dec, 2019 2 commits

add grad maker assert, test=develop (#21564) · 3a7caf48

Committed by Zeng Jinle on Dec 05, 2019

3a7caf48

Split VarBase from Python Variable for Dygraph (#21359) · cdd46d7e

Committed by Leo Chen on Dec 05, 2019

* test=develop, fix docker with paddle nccl problem

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

* add variable method to varbase and refactor to_variable to support return varbase

* support kwargs in varbase constructor

* add VarBase constructor to support default python args

* refine varbase initial method

* reset branch

* fix ut for change VarBase error info to PaddleEnforce

* cherry is parameter change before

* overload isinstance to replace too many change of is_variable

* rm useless files

* rm useless code merged by git

* test=develop, fix some ut failed error

* test=develop, fix test_graph_wrapper

* add some tests, test=develop

* refine __getitem__, test=develop

* add tests, test=develop

* fix err_msg, test=develop

cdd46d7e

04 Dec, 2019 1 commit

Add get_all_kernels api of registered data_type in pybind.cc (#21499) · 54382ce4

Committed by Aurelius84 on Dec 04, 2019

* add _get_all_register_op_kernels api test=develop

* refine usage of check_op_register_type test=develop

* add import in core test=develop

54382ce4

27 Nov, 2019 1 commit

Support numpy bridge (enabled by default in dygraph mode) (#20983) · d5ff79e5

Committed by Youwei Song on Nov 27, 2019

* add numpy bridge

* fix template compile

* add unittest, add default
test=develop

* fix unittest
test=develop

* fix unittest
test=develop

* zero_copy=True for to_variable,
test=develop

* bug fix
test=develop

* disable deprecated NumPy API
test=develop

* use better design of NumpyAllocator
test=develop

* fix Py_None check
test=develop

* reset c++ tracer when jump out dygraph guard
test=develop

* refine PADDLE_ENFORCE_xx format
test=develop

* bug fix of tracer switch
test=develop

* update decref
test=develop

d5ff79e5

25 Nov, 2019 1 commit

Add global value getter setter (#21285) · b9f8ae84

Committed by Zeng Jinle on Nov 25, 2019

* add global value getter setter, test=develop

* fix error messages, test=develop

b9f8ae84

24 Nov, 2019 1 commit

Refactor fetch handler (#21264) · 691ced87

Committed by Dong Daxiang on Nov 24, 2019

* fix fetch handler problem and refactor
When a user defines a FetchHandler class, they should initialize the handler
with a variable dict. Each key of the variable dict is a user-defined name,
and each value is a Variable generated by the Python API.

For each fetch, the user should implement the handler function, in which
fetched_result_dict is available; the fetched values can be accessed
with the user-defined keys.

691ced87
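
The usage pattern described in this commit message can be sketched without Paddle. Everything below is a hypothetical stand-in: `FetchHandlerSketch` only mimics the shape of the interface described above (a variable dict at construction, a `handler` method receiving `fetched_result_dict`), `run_fetch` simulates a single fetch, and a plain string takes the place of a real Variable.

```python
class FetchHandlerSketch(object):
    """Hypothetical stand-in for the FetchHandler base class described above."""

    def __init__(self, var_dict):
        # var_dict maps a user-defined name to the variable to fetch.
        self.var_dict = var_dict

    def handler(self, fetched_result_dict):
        raise NotImplementedError("subclasses implement handler()")


class CollectLoss(FetchHandlerSketch):
    """User-defined handler that records the value fetched under "loss"."""

    def __init__(self, var_dict):
        super(CollectLoss, self).__init__(var_dict)
        self.history = []

    def handler(self, fetched_result_dict):
        # Fetched values are accessed with the same user-defined keys.
        self.history.append(fetched_result_dict["loss"])


def run_fetch(fetch_handler, lookup):
    """Simulate one fetch: resolve every variable, then call the handler."""
    results = {name: lookup(var) for name, var in fetch_handler.var_dict.items()}
    fetch_handler.handler(results)


# "mean_0.tmp_0" is a placeholder name standing in for a real Variable.
loss_handler = CollectLoss({"loss": "mean_0.tmp_0"})
run_fetch(loss_handler, {"mean_0.tmp_0": 0.25}.get)
```

After the simulated fetch, `loss_handler.history` holds the single value `0.25`. In the real framework the executor, not user code, would be responsible for calling the handler.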

14 Nov, 2019 1 commit

Add friendly dygraph trace API (#21091) · 5fdfbe34

Committed by Zeng Jinle on Nov 14, 2019

* friendly trace interface, test=develop

* refine TracedLayer, test=develop

* add some docs, test=develop

5fdfbe34

01 Nov, 2019 1 commit

Update Tensor.set() to support float16 (#19964) · 9974e407

Committed by Leo Chen on Nov 01, 2019

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

9974e407

31 Oct, 2019 1 commit

Refine the cache of program, context and scope in executor. (#18483) · 16e4d026

Committed by Yiqun Liu on Oct 31, 2019

* Refine the cache of program, context and scope in executor.
test=develop

* Refine the unittest test_executor_and_use_program_cache.

* Add the test of PaddingRNN with use_program_cache=True.
test=develop

* Remove a check.
test=develop

* Refine the unittest to check whether it is correct when setting use_program_cache=True.
test=develop

16e4d026

29 Oct, 2019 1 commit

save load problem fix and new feature add (#20823) · ff0886a9

Committed by hong on Oct 29, 2019

* fix persistable;

* fix save load bugs; test=develop

* fix bug; test=develop

* add example for new io api; test=develop

* add example; test=develop

ff0886a9

18 Oct, 2019 1 commit

Fix dgc nan by stripping nccl from sparseReduce. (#20630) · 507afa8a

Committed by WangXi on Oct 17, 2019

507afa8a

14 Oct, 2019 2 commits

Dlpack support (#20039) · 12e4be03

Committed by 633WHU on Oct 14, 2019

* support dlpack to tensor and implement python interface test=develop

* add unittest for _to_dlpack and from_dlpack test=develop

12e4be03

Refine py_reader exit (#20331) · 40effc61

Committed by Zeng Jinle on Oct 14, 2019

* refine py_reader exit, test=develop

* fix multiprocess_reader exception unittest, test=develop

* increase code coverage for legacy fluid.layers.py_reader, test=develop

40effc61

11 Oct, 2019 3 commits

update the api en doc of BuildStrategy (#20445) · f855a86c

Committed by liu zhengxi on Oct 11, 2019

* update the api en doc of BuildStrategy and its setting, test=develop, test=document_fix

* update api.spec, test=develop, test=document_fix

* update the en doc of fuse_relu_depthwise_conv, test=develop, test=document_fix

f855a86c

doc fix, test=develop, test=document_fix (#20239) · a010d883

Committed by tangwei12 on Oct 11, 2019

* doc fix, test=develop, test=document_fix

a010d883

Update en APIs of LoDTensor (#20115) · 5a7142ac

Committed by Leo Chen on Oct 11, 2019

* polish en APIs of LodTensor, test=develop, test=document_dix

* polish en APIs of LoDTensor, test=develop, test=document_fix

* follow comments, test=develop, test=document_dix

5a7142ac

10 Oct, 2019 2 commits

New save load interface (#20148) · fa43e80e

Committed by hong on Oct 10, 2019

* add new save load interface; test=develop

* add new save interface; test=develop

* add save load interface ;

* fix save load error;

* fix dygraph set dict bug;

* add save load unit test; test=develop

* fix test_imperative_optimizer bug; test=develop

* fix unittest optimizer bug; test=develop

* fix code coverage; test=develop

* fix coverage; test=develop

* add document for apis; test=develop

* fix unittest error; test=develop

* fix save load unit test error; test=develop

* fix error message; test=develop

* change set_parameter set_optimizer to save_dygraph; test=develop

* add load_graph check; test=develop

* fix api spec; test=develop

fa43e80e

Polish en doc of LoDTensorArray, test=document_fix (#19972) · f4c56e9f

Committed by Leo Chen on Oct 10, 2019

* Polish en doc of LoDTensorArray, test=develop, test=document_fix

* follow comments, test=develop, test=document_dix

f4c56e9f

09 Oct, 2019 1 commit

refine CUDA CPU places en doc (#20243) · 20f68916

Committed by Youwei Song on Oct 09, 2019

* fix CUDA CPU places, test=document_fix, test=develop

* fix CUDAPlace param doc, test=document_fix, test=develop

* fix CUDAPlace param doc, test=document_fix, test=develop

20f68916

07 Oct, 2019 1 commit

trainer from dataset fetch targets (#19760) · c9139c3d

Committed by tangwei12 on Oct 07, 2019

add executor.FetchHandler for train/infer from the dataset

c9139c3d

28 Sep, 2019 1 commit

Enable users to create custom cpp op outside framework. (#19256) · 1a3eef02

Committed by qingqing01 on Sep 28, 2019

* Custom ops must be written following the framework OP spec.
* Package fluid_framework.so and headers into whl.
* Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
* Export some C-APIs to merge OpInfo between core.so and custom_op.so.
* Add unit testing.
* Update API.spec.

1a3eef02
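
As a rough illustration of how the include and lib directories from this commit might be used, the sketch below assembles a compiler command for a custom op library. `custom_op_compile_command` is a hypothetical helper. The `g++` flags are assumptions, and the default library name is only inferred from the `fluid_framework.so` mentioned above, so a real build may need different or additional flags.

```python
def custom_op_compile_command(source, output, include_dir, lib_dir,
                              lib_name="fluid_framework"):
    """Return a g++ argument list that builds a shared custom-op library.

    include_dir and lib_dir are expected to come from
    paddle.sysconfig.get_include() and paddle.sysconfig.get_lib().
    The flag set and lib_name are assumptions made for illustration.
    """
    return [
        "g++", "-std=c++11", "-shared", "-fPIC",
        source,
        "-o", output,
        "-I" + include_dir,
        "-L" + lib_dir,
        "-l" + lib_name,
    ]


# Usage sketch, left commented out because it requires Paddle to be installed:
# import paddle
# cmd = custom_op_compile_command(
#     "my_op.cc", "my_op.so",
#     paddle.sysconfig.get_include(), paddle.sysconfig.get_lib())
# print(" ".join(cmd))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting.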
