Commits · 1ca6ea03186130ba662817d8a9dfe5973cb888f5 · BaiXuePrincess / Paddle

08 Sep, 2019 1 commit
- fix cmakelist deps (#19668) · 1ca6ea03
  Committed by hutuxian on Sep 08, 2019
```
fix cmakelist deps: remove unnecessary deps and add proper op deps
```
  1ca6ea03
31 Aug, 2019 1 commit

Committed by hutuxian on Aug 31, 2019

* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op; for now, this op is not exposed to users.
* Add a BoxHelper class that provides 'BeginPass', 'EndPass', 'FeedPass', and similar functions.
* Add 'BoxPSDataset' in the Python code.
* Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
* Add UT.
* For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982

c756b5d2

19 Aug, 2019 1 commit
- merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4
  Committed by Zeng Jinle on Aug 19, 2019
  
  5b6673c4
09 Aug, 2019 1 commit

Add call stack info during compile time (#19067) · 21440b4d

Committed by chengduo on Aug 09, 2019

* Add call stack info during runtime and compile time
test=develop

* Rename operator_call_stack
test=develop

* Add unit test
test=develop

* follow comment
test=develop

21440b4d

02 Aug, 2019 1 commit

Open gc by default (#18836) · 7ac748ad

Committed by Zeng Jinle on Aug 02, 2019

* open gc by default, test=develop

* fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop

* fix conditional_block op eager deletion bug, test=develop

* add some comments to reviewers, test=develop

7ac748ad

29 Jul, 2019 1 commit

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments,test=develop

* follow luotao's comments, test=develop

8008ab4e

19 Jul, 2019 1 commit

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Committed by Huihuang Zheng on Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)

89bc3fd8
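
As a quick check on the PaddingRNN figures reported for 89bc3fd8, the relative changes can be worked out as follows. This is only an arithmetic sketch over the numbers quoted in the commit message; the helper name `relative_change` is made up for illustration and is not part of Paddle.

```python
def relative_change(new, old):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Figures quoted in the commit message (with this PR vs. without this PR).
memory_change = relative_change(6414, 6837)   # GPU memory in MiB
speed_change = relative_change(10.28, 9.89)   # speed in steps/s

print(f"GPU memory: {memory_change:+.2f}%")
print(f"Speed:      {speed_change:+.2f}%")
```

On these numbers, GPU memory drops by roughly 6.2% and throughput rises by roughly 3.9%.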

17 Jul, 2019 1 commit
- remove async executor and add data_feed.proto to the deps of train demo (#18659) · d714bf03
  Committed by guru4elephant on Jul 17, 2019
```
* remove async executor and add data_feed.proto to the deps of train demo
```
  d714bf03
27 Jun, 2019 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* fix comment
test=develop

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* marcro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dumple  py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

b7128bac

12 Jun, 2019 1 commit
- add trainer_desc proto DEPS (#18019) · f1d458da
  Committed by hutuxian on Jun 12, 2019
  
  f1d458da
11 Jun, 2019 1 commit

Pipeline Concurrency (#17402) · 969e6378

Committed by hutuxian on Jun 11, 2019

Add Pipeline Concurrency Train Mode:
- Cpp: pipeline_trainer & section_worker
- Python: PipelineOptimizer
- Add a new data_feed type: PrivateInstantDataFeed
- Add a test demo of the pipeline trainer; the test model is a GNN
- win32 is not supported for now

969e6378
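
To illustrate the general idea behind a pipeline train mode such as the one added in 969e6378, here is a minimal, generic sketch in plain Python. Each stage (a "section") runs in its own thread and hands items to the next stage through a bounded queue. This is not Paddle's pipeline_trainer or section_worker implementation; `run_section`, `run_pipeline`, and `queue_size` are hypothetical names chosen for this example.

```python
import queue
import threading

SENTINEL = object()  # unique marker for the end of the stream


def run_section(stage_fn, inbox, outbox):
    """Apply stage_fn to every item from inbox and forward results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next section
            break
        outbox.put(stage_fn(item))


def run_pipeline(stages, inputs, queue_size=2):
    """Run the stages concurrently (one thread each) and return outputs in order."""
    queues = [queue.Queue(maxsize=queue_size) for _ in range(len(stages) + 1)]

    def feed():
        # Feed from a separate thread so the caller can drain the last queue
        # while the bounded queues apply back-pressure.
        for x in inputs:
            queues[0].put(x)
        queues[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    threads += [
        threading.Thread(target=run_section, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for t in threads:
        t.join()
    return results


stages = [lambda x: x + 1, lambda x: x * 2]
print(run_pipeline(stages, range(5)))  # [2, 4, 6, 8, 10]
```

Because each stage is a single thread reading from a FIFO queue, output order matches input order, and the bounded queues keep a fast stage from running arbitrarily far ahead of a slow one.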

23 May, 2019 1 commit

Fix allocator bug (#16712) · c6189637

Committed by Zeng Jinle on May 23, 2019

* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop

c6189637

29 Mar, 2019 22 commits
- fix comments of 16410, test=develop (#16499) · 278debab
  Committed by liuwei1031 on Mar 29, 2019
```
* fix comments of 16410, test=develop

* modify inplace_op_inference_test according to pass interface change, test=develop
```
  278debab
- rebase current develop and fix conflict · 720647e1
  Committed by dongdaxiang on Mar 29, 2019
```
test=develop
```
  720647e1
- add timer to distributed executor · 241d8808
  Committed by dongdaxiang on Mar 28, 2019
```
test=develop
```
  241d8808
- add trainer_desc.proto to distributed executor · 3c73859e
  Committed by dongdaxiang on Mar 28, 2019
```
test=develop
```
  3c73859e
- fix distributed building · 0030eb2a
  Committed by dongdaxiang on Mar 28, 2019
```
test=develop
```
  0030eb2a
- remove trainer_library in CMakeLists · f39b323e
  Committed by dongdaxiang on Mar 23, 2019
```
test=develop
```
  f39b323e
- fix dataset float32 type problem · f6c9232a
  Committed by dongdaxiang on Mar 18, 2019
  
  f6c9232a
- add dataset factory && fix style · ecfc7df9
  Committed by xujiaqi01 on Mar 13, 2019
  
  ecfc7df9
- make Dataset* as an argument · b415ec27
  Committed by dongdaxiang on Mar 09, 2019
  
  b415ec27
- modify c++ and python dataset related code & fix bug · dd67ad08
  Committed by xjqbest on Mar 09, 2019
  
  dd67ad08
- fix some conflict for compilation · cc4def6b
  Committed by dongdaxiang on Mar 08, 2019
  
  cc4def6b
- refactor & fix bug · 9bca1926
  Committed by heqiaozhi on Mar 08, 2019
  
  9bca1926
- add DataSet and InMemoryDataFeed, support load data into memory and shuffle data · 2e9a836c
  Committed by xjqbest on Mar 06, 2019
  
  2e9a836c
- add RunFromDataset in executor · 24863897
  Committed by dongdaxiang on Mar 08, 2019
  
  24863897
- add DataSet and InMemoryDataFeed, support load data into memory and shuffle data · 824b84d1
  Committed by xjqbest on Mar 06, 2019
  
  824b84d1
- move fs.cc and shell.cc into paddle/fluid/framework/io · 1fe54416
  Committed by dongdaxiang on Feb 22, 2019
```
test=develop
```
  1fe54416
- add fs_local_open example · afaf9370
  Committed by dongdaxiang on Feb 22, 2019
  
  afaf9370
- add printer for fetch variable · cf136064
  Committed by dongdaxiang on Feb 18, 2019
  
  cf136064
- fix ngraph compile option · 54f047a1
  Committed by dongdaxiang on Feb 02, 2019
  
  54f047a1
- add common.h.in back · dd1dc9bc
  Committed by dongdaxiang on Feb 02, 2019
  
  dd1dc9bc
- refine device_worker and trainer code · c1650120
  Committed by dongdaxiang on Feb 02, 2019
```
test=develop
```
  c1650120
- make -DWITH_PSLIB=ON compilable · 24a80011
  Committed by dongdaxiang on Jan 28, 2019
  
  24a80011
27 Mar, 2019 1 commit
- delete source file no_need_buffer_vars_inference.cc · a0f4fefb
  Committed by sneaxiy on Mar 27, 2019
```
test=develop
```
  a0f4fefb
26 Mar, 2019 2 commits
- fix env variable settting bug · 78fb3a62
  Committed by sneaxiy on Mar 26, 2019
```
test=develop
```
  78fb3a62
- fix some op grad maker · 7000ec85
  Committed by sneaxiy on Mar 25, 2019
```
fix ctest eager deletion disable bug
test=develop
```
  7000ec85
25 Mar, 2019 1 commit
- split PR · c20db635
  Committed by sneaxiy on Mar 25, 2019
```
test=develop
```
  c20db635
24 Mar, 2019 1 commit
- add op registry type · a93a9eef
  Committed by sneaxiy on Mar 22, 2019
```
refine gc code
test=develop
```
  a93a9eef
21 Mar, 2019 1 commit

add more unittest · 953214ad

Committed by sneaxiy on Mar 19, 2019

modify allocator strategy
remove changes of legacy buddy_allocator
test=develop

953214ad
