Commits · 4e8bc02461826b7b62919e5e9ba0833027b82859 · PaddlePaddle / Paddle

03 Mar, 2020 · 1 commit
- add fluid.device_guard to specify the device type for Op (#22254) · 4e8bc024
  Committed by Zhang Ting on Mar 03, 2020
```
* add fluid.device_guard to specify the device type for Op
```
12 Nov, 2019 · 1 commit
- Fix dgc buffer illegal & reuse velocity (#21012) · de5d3ff6
  Committed by WangXi on Nov 12, 2019
29 Jul, 2019 · 1 commit

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments, test=develop

* follow luotao's comments, test=develop

02 Jul, 2019 · 1 commit

supports collective training with programs (#18392) · a873fa84

Committed by Yi Liu on Jul 02, 2019

1. Since the allreduce op has 4 reduce types, we split these four reduce types into four ops
2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and removed the device-specific DeviceContext template parameter, as the target DeviceContext is already known
3. We removed the newly added Collective op role to reduce the complexity of program and graph analysis

27 Jun, 2019 · 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittests for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plug in more strategies

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* macro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dump py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O


08 May, 2019 · 1 commit
- Code Clean: Move all pass to paddle::framework::ir (#17228) · 04bd413a
  Committed by chengduo on May 08, 2019
```
* move pass to ir

* polish code
test=develop

* fix dependency
test=develop
```
21 Apr, 2019 · 1 commit

Refine model gpu memory (#16993) · 1202d3fc

Committed by Zeng Jinle on Apr 21, 2019

* speedup gc and inplace softmax_with_cross_entropy_grad
test=develop

* refine models gpu mem
Merge skip vars and warning messages of mem opt
remove relu mem opt
test=develop

* follow comments
test=develop

18 Apr, 2019 · 1 commit
- Polish DGC code (#16818) · cbdb8a17
  Committed by gongweibao on Apr 18, 2019
08 Jan, 2019 · 1 commit
- add the python callstack for debug support test=develop · a6f5ceee
  Committed by peizhilin on Jan 08, 2019
26 Dec, 2018 · 1 commit
- Revert "cherry-pick the #12759" · 2388d0e7
  Committed by peizhilin on Dec 26, 2018
```
test=develop

This reverts commit 7f6d8ace.
```
25 Dec, 2018 · 1 commit
- cherry-pick the #12759 · 7f6d8ace
  Committed by peizhilin on Dec 25, 2018
```
test=develop
```
08 Nov, 2018 · 1 commit

Fix input<tensor> (#14208) · c5b6573a

Committed by chengduo on Nov 08, 2018

* fix input<tensor>
test=develop

* fix split_ids
test=develop

* ElementwiseMul should not support SelectedRows

* fix scale op
test=develop

* change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar()

* fix operator

* refine MultiOutput

* fix MultiOutput
test=develop

* disable test_dist_save_load
test=develop

* fix elementwise_op
test=develop

* add get_sparse_as_op
test=develop

* add info for check
test=develop

* rename get_sparse_as_op with extract_rows_as_op.
test=develop

* elementwise doesn't support selected_rows

* fix regularizer

* remove extract_rows_as
test=develop

* fix ci
test=develop

* add test for sum_op

* fix regularizer
test=develop

*  test=develop

* fix pserver weight decay multi inputs test=develop

30 Sep, 2018 · 1 commit
- Revert "Merge pull request #13201 from reyoung/revert_callstack" (#13697) · 186b2b13
  Committed by Yu Yang on Sep 30, 2018
```
This reverts commit 21bb9e91, reversing
changes made to 3fa68dc1.

test=develop
```
21 Sep, 2018 · 1 commit

[Feature] dist op role and lr op role, to support memory optimize with dist training (#13220) · 29c63d18

Committed by Wu Yi on Sep 21, 2018

* wip

* clean up

* should fix running with memopt

* add ut

* mark lr schedule op role

* hide lr_schedule_guard

* use op_role_var instead of ufind

* unify dist test name

* wip for py3 support

* fix var deref

* fix python3 mem_opt order

* remove comments

16 Sep, 2018 · 1 commit
- Revert changes for debug · 1c87558c
  Committed by Yibing Liu on Sep 16, 2018
14 Sep, 2018 · 1 commit
- Get sequence length in sequence_pad op & fix sequence_mask op · f6595811
  Committed by Yibing Liu on Sep 14, 2018
04 Sep, 2018 · 1 commit
- Revert "Revert "Add Python Callstacks when Op::Run error (#12759)"" · cda7842e
  Committed by Yu Yang on Sep 04, 2018
```
This reverts commit 1f270275.
```
29 Aug, 2018 · 1 commit
- allow to use name_scope for debugging and visualization · 51ef0ad7
  Committed by Xin Pan on Aug 28, 2018
23 Aug, 2018 · 3 commits

Revert "Add Python Callstacks when Op::Run error (#12759)" · 1f270275
Committed by guochaorong on Aug 23, 2018
```
This reverts commit b2df1700.
```

Resolve multi gpu async deps (#12828) · b8da70c3

Committed by Wu Yi on Aug 23, 2018

* dist transpiler add control dependency var between send and recv

* fix async deps

* follow comments and refine

* fix deps connect for rpc ops

Add Python Callstacks when Op::Run error (#12759) · b2df1700

Committed by Yu Yang on Aug 23, 2018

* Add Python Callstacks when Op::Run error

* Skip op with sub-block

* refactor: refine callstack info's format

* Reshape only support matrix

* Polish Python code

* Fix UT

* Fix Py3

29 May, 2018 · 1 commit
- singleton rpc_client · 20c24c05
  Committed by Yancey1989 on May 29, 2018
15 May, 2018 · 1 commit
- Add op role · 017bba16
  Committed by yuyang18 on May 15, 2018
07 Apr, 2018 · 1 commit
- Fix cpplint errors of paddle/fluid/pybind and add some tests (#9694) · 1543c4cf
  Committed by Yi Wang on Apr 06, 2018
```
* cpplint test and add tesnor_py_test.cc

* Update

* Update
```
12 Feb, 2018 · 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 · 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
05 Jan, 2018 · 1 commit

Feature/use cudnn (#7141) · 5593858d

Committed by dzhwinter on Jan 05, 2018

* "add c++ side kernel selection"

* "add multiple kernel op test"

* "kernel selection only support cudnn"

* "better formatter"

* "small fix with UseCPU"

* "depends on change interface Get(Place, Library)"

* "fix CI"

* "fix python cudnn test"

* "leave the register cudnn op to another PR"

* "fix CI"

* "use all kernel by default"

* "fix CI"

25 Dec, 2017 · 1 commit

Impl kernel hint (#6883) · af0c4c45

Committed by Qiao Longfei on Dec 25, 2017

* init kernel hint

* fix typo

* rm unused code

* add include in op_kernel.h

* restore op_kernel since it will be moved to op_kernel_type

* change force_cpu to use_cpu

* fix compilation

19 Dec, 2017 · 1 commit
- export const value to python · 5c530ea8
  Committed by qiaolongfei on Dec 19, 2017
25 May, 2017 · 1 commit
- Remove not necessary functionalities in Parameter · 273e3f44
  Committed by Yu Yang on May 25, 2017
09 Dec, 2016 · 1 commit
- Change "Baidu, Inc" into "PaddlePaddle Authors" · e9549cbb
  Committed by Yi Wang on Dec 08, 2016
22 Nov, 2016 · 1 commit
- clang format .cc .h .cpp .c and .hpp file · 80c68d38
  Committed by Luo Tao on Nov 22, 2016
29 Aug, 2016 · 1 commit

fix dash and space bug, · b72beee4

Committed by zhangjinchao01 on Aug 29, 2016

ISSUE=4586495

git-svn-id: https://svn.baidu.com/idl/trunk/paddle@1408 1ad973e4-5ce8-4261-8a94-b56d1f490c56

PaddlePaddle / Paddle · synced successfully about 1 year ago