Commits · 554d3a71d207b780d430d94bcaaa27142c7296df · 机器未来 / Paddle

28 Apr, 2019 2 commits

Refine dropout gpu memory (#17095) · 28d69d71

Committed by Zeng Jinle on Apr 28, 2019

* refine_dropout_mem,test=develop

* # This is a combination of 14 commits.
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066)

# This is the 2nd commit message:

Fleet unify distributed training (#16791)

* implement distributed transpiler with fleet
# This is the 3rd commit message:

ParallelDyGraph with GPU collective mode (#16827)

implement dygraph.parallel.DataParallel to hook reduce op.

# This is the 4th commit message:

Init mixed precision training interface (#16856)

* Init mixed precision training interface

* Add fp16 test script

test=develop

* All initializers support float16

test=develop

* Code cleanup & add more code annotations

test=develop

* Update API spec

test=develop

* Add usage example in doc

test=develop

# This is the 5th commit message:

fix reference_count_pass,test=develop (#17060)

test=develop
# This is the 6th commit message:

Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)

* Cache the information of linear interpolation in forward and use it in backward.
test=develop

* Fix cuda kernel.
test=develop

# This is the 7th commit message:

remove unnecessary prepare_data (#17080)

test=develop
# This is the 8th commit message:

fix interpolate cu. test=develop (#17101)

# This is the 9th commit message:

test=develop, double backward leaky_relu (#17067)

backward of backward: leaky_relu
# This is the 10th commit message:

fix fuse optimizer ops (#17102)

test=develop
# This is the 11th commit message:

truncated_gaussian_random supported in distributed training, test=develop (#17091)

# This is the 12th commit message:

 Detailed coordinate description for yolov3 loss (#17007)

* Detailed coordinate description for yolov3 loss

test=develop

* modified api.spec

test=develop

* modified loss name

* fix api.spec

test=develop

* polish description

test=develop

* modified api.spec

test=develop

# This is the 13th commit message:

fix test_weight_decay (#17109)

test=develop
# This is the 14th commit message:

Path flag (#17105)

* fix python/paddle/fluid/__init__.py detecting problems

28d69d71

Use CudnnWorkspaceHandle in exhaustive search (#17082) · b9494058

Committed by Huihuang Zheng on Apr 28, 2019

1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search.

test=develop

b9494058

26 Apr, 2019 5 commits
- Detailed coordinate description for yolov3 loss (#17007) · 7da7881c
  Committed by xiaoting on Apr 26, 2019
```
* Detailed coordinate description for yolov3 loss

test=develop

* modified api.spec

test=develop

* modified loss name

* fix api.spec

test=develop

* polish description

test=develop

* modified api.spec

test=develop
```
  7da7881c
- fix fuse optimizer ops (#17102) · 794a1958
  Committed by chengduo on Apr 26, 2019
```
test=develop
```
  794a1958
- test=develop, double backward leaky_relu (#17067) · 258e000b
  Committed by ceci3 on Apr 26, 2019
```
backward of backward: leaky_relu
```
  258e000b
- fix interpolate cu. test=develop (#17101) · 10c487eb
  Committed by Kaipeng Deng on Apr 26, 2019
  10c487eb
- remove unnecessary prepare_data (#17080) · aca60e9a
  Committed by Tao Luo on Apr 26, 2019
```
test=develop
```
  aca60e9a
25 Apr, 2019 4 commits

Speedup roi_perspective_transform op by caching the information of linear... · 55ce36e9

Committed by whs on Apr 25, 2019

Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)

* Cache the information of linear interpolation in forward and use it in backward.
test=develop

* Fix cuda kernel.
test=develop

55ce36e9

fix reference_count_pass,test=develop (#17060) · 842ded14
Committed by Zeng Jinle on Apr 25, 2019
```
test=develop
```
842ded14

Init mixed precision training interface (#16856) · beda7825

Committed by Yibing Liu on Apr 25, 2019

* Init mixed precision training interface

* Add fp16 test script

test=develop

* All initializers support float16

test=develop

* Code cleanup & add more code annotations

test=develop

* Update API spec

test=develop

* Add usage example in doc

test=develop

beda7825

ParallelDyGraph with GPU collective mode (#16827) · 0b07eef1
Committed by Yan Xu on Apr 25, 2019
```
implement dygraph.parallel.DataParallel to hook reduce op.
```
0b07eef1

24 Apr, 2019 1 commit
- use fast executor as default (#17044) · cc316816
  Committed by chengduo on Apr 24, 2019
```
test=develop
```
  cc316816
23 Apr, 2019 5 commits

Add fuse momenutum ops (#16745) · a2be4b4d
Committed by chengduo on Apr 23, 2019
```
* Add fuse momenutum ops
```
a2be4b4d
load persistables with selected rows, test=develop (#17047) · 13295d90
Committed by tangwei12 on Apr 23, 2019

13295d90
fix runtime_context_cache bug when gpu model has an op runs only on cpu · 490e7462
Committed by luotao1 on Apr 23, 2019
```
test=develop
```
490e7462
Make conv cudnn workspace size configurable (#17036) · 0c335dcd
Committed by Zeng Jinle on Apr 23, 2019
```
* make_conv_cudnn_ws_size_configurable, test=develop

* change std::max to std::min
test=develop
```
0c335dcd

Support backward of backward for Relu and add a new gradient checker by... · c1c2633a

Committed by qingqing01 on Apr 23, 2019

Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. (#16862)

* Support backward of backward and a new gradient checker
* Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package.

1. Add ReluDoubleGradMaker when register relu_grad.
2. Add a new gradient checker by comparing theoretical and numerical Jacobian.  Check double gradients by double_grad_check.

c1c2633a
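
The commit above describes a gradient checker that compares a theoretical Jacobian with a numerical one. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not Paddle's implementation: the names `relu`, `relu_jacobian`, `numerical_jacobian`, and `check_jacobian` are hypothetical, the numerical Jacobian is assumed to use central differences, and only the first-order check is shown (the double-gradient check mentioned in the commit is not covered).

```python
def relu(x):
    """Elementwise ReLU on a list of floats."""
    return [v if v > 0.0 else 0.0 for v in x]


def relu_jacobian(x):
    """Theoretical Jacobian of ReLU: diagonal, 1 where x > 0, else 0."""
    n = len(x)
    return [[1.0 if (i == j and x[j] > 0.0) else 0.0 for j in range(n)]
            for i in range(n)]


def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of J[i][j] = d f(x)[i] / d x[j]."""
    m = len(f(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        x_plus = list(x)
        x_plus[j] += eps
        x_minus = list(x)
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


def check_jacobian(f, analytic_jac, x, eps=1e-6, atol=1e-4):
    """Return True if every analytic entry matches the numerical estimate."""
    numeric = numerical_jacobian(f, x, eps)
    analytic = analytic_jac(x)
    return all(abs(a - n) <= atol
               for row_a, row_n in zip(analytic, numeric)
               for a, n in zip(row_a, row_n))
```

Calling `check_jacobian(relu, relu_jacobian, [-1.5, 0.3, 2.0])` compares the two matrices entry by entry. The test points are kept away from 0, where ReLU is not differentiable and a finite-difference estimate would be unreliable.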

22 Apr, 2019 8 commits
- fix bug in save, test=develop · 45136b1b
  Committed by tangwei12 on Apr 22, 2019
  45136b1b
- add doc for memory_optimize, test=develop (#17010) · a770ce06
  Committed by liuwei1031 on Apr 22, 2019
```
* add doc for memory_optimize, test=develop

* update doc, test=develop

* doc update, test=develop
```
  a770ce06
- add parallel build script to ci … (#16901) · d9991dcc
  Committed by wopeizl on Apr 22, 2019
```
* add parallel build script to ci test=develop
* 1. classify the test case as single card/two cards/multiple cards type
   2. run test case according to the run type
```
  d9991dcc
- fix potential hung in generate proposals, test=develop · b2df6de8
  Committed by jerrywgz on Apr 22, 2019
  b2df6de8
- fix py_reader demo (#16997) · 24923f76
  Committed by Zeng Jinle on Apr 22, 2019
```
test=develop
```
  24923f76
- Speed unit testing. (#16978) · ea42e431
  Committed by qingqing01 on Apr 22, 2019
```
* Speed affine_channel_op unit testing
* Add check in tensor_py
* Fix ONLY_CPU Compiling
```
  ea42e431
- enhance generate proposal labels, test=develop · d3a66fc6
  Committed by jerrywgz on Apr 22, 2019
  d3a66fc6
- fix nccl wrapper on windows · 51a0243a
  Committed by wopeizl on Apr 22, 2019
```
test=develop
```
  51a0243a
21 Apr, 2019 1 commit

Refine model gpu memory (#16993) · 1202d3fc

Committed by Zeng Jinle on Apr 21, 2019

* speedup gc and inplace softmax_with_cross_entropy_grad
test=develop

* refine models gpu mem
Merge skip vars and warning messages of mem opt
remove relu mem opt
test=develop

* follow comments
test=develop

1202d3fc

20 Apr, 2019 1 commit

Support seq len equal to 0 in sequence ops (#16935) · 3c375751

Committed by Yibing Liu on Apr 20, 2019

* Support seq len equal to 0 in sequence ops

test=develop

* Add more test cases

* Fix some comments

test=develop

* Fix py3 error

test=develop

3c375751

19 Apr, 2019 2 commits
- Check some shapes only in runtime (#16919) · 36c05d36
  Committed by Yibing Liu on Apr 19, 2019
```
* Check some shapes only in runtime

test=develop

* Follow review comments

test=develop

* Update API spec
```
  36c05d36
- disable runtime_context_cache pass by default · aa7b975b
  Committed by Tao Luo on Apr 19, 2019
```
test=develop
```
  aa7b975b
18 Apr, 2019 2 commits
- fix trt anakin subgraph compile rely · bc6b0ca1
  Committed by nhzlx on Apr 18, 2019
```
test=develop
```
  bc6b0ca1
- Polish DGC code (#16818) · cbdb8a17
  Committed by gongweibao on Apr 18, 2019
  cbdb8a17
17 Apr, 2019 7 commits
- fix dygraph save/load checkpoint error, test=develop · a7c11979
  Committed by lujun on Apr 17, 2019
  a7c11979
- use multi-thread to speedup CI tests · bc037c13
  Committed by Tao Luo on Apr 17, 2019
```
test=develop
```
  bc037c13
- fix sampling id op bug (#16909) · 2b61db07
  Committed by tangwei12 on Apr 17, 2019
```
* fix sampling id op bug, test=develop
```
  2b61db07
- fix overflow by int32 mul test=develop (#16794) · c474e7dd
  Committed by Kevin on Apr 17, 2019
```
* fix overflow by int32 mul test=develop

* fix reference nullptr

* fix codestyle test=develop

* modify to point in ContextProjectFunctor test=develop

* modify to point in ContextProjectFunctor test=develop

* modify . to -> test=develop
```
  c474e7dd
- Update logical_op.cc · 8cff2b42
  Committed by Yan Chunwei on Apr 17, 2019
```
test=develop
```
  8cff2b42
- fix GPU compile error problem · 2ab2869c
  Committed by dongdaxiang on Apr 17, 2019
  2ab2869c
- add pybind dependency · 466d177d
  Committed by dongdaxiang on Apr 16, 2019
```
test=develop
```
  466d177d
16 Apr, 2019 2 commits
- fix/positive negative pair op (#16895) · 008fd785
  Committed by tangwei12 on Apr 16, 2019
```
* fix infershape in runtime

* fix infershape in runtime
test=develop

* fix infershape in runtime
```
  008fd785
- add <memory> · 9c6ee7cf
  Committed by xuezhong on Apr 16, 2019
```
test=develop
```
  9c6ee7cf
