Commits · 82d31503643d5a01bab87f1a916498740258725b · 机器未来 / Paddle

28 Apr, 2019 2 commits

Refine dropout gpu memory (#17095) · 28d69d71

Committed by Zeng Jinle on Apr 28, 2019

* refine_dropout_mem,test=develop

* # This is a combination of 14 commits.
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066)

# This is the 2nd commit message:

Fleet unify distributed training (#16791)

* implement distributed transpiler with fleet
# This is the 3rd commit message:

ParallelDyGraph with GPU collective mode (#16827)

implement dygraph.parallel.DataParallel to hook reduce op.

# This is the 4th commit message:

Init mixed precision training interface (#16856)

* Init mixed precision training interface

* Add fp16 test script

test=develop

* All initializers support float16

test=develop

* Code cleanup & add more code annotations

test=develop

* Update API spec

test=develop

* Add usage example in doc

test=develop

# This is the 5th commit message:

fix reference_count_pass,test=develop (#17060)

test=develop
# This is the 6th commit message:

Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)

* Cache the information of linear interpolation in forward and use it in backward.
test=develop

* Fix cuda kernel.
test=develop

# This is the 7th commit message:

remove unnecessary prepare_data (#17080)

test=develop
# This is the 8th commit message:

fix interpolate cu. test=develop (#17101)

# This is the 9th commit message:

test=develop, double backward leaky_relu (#17067)

backward of backward: leaky_relu
# This is the 10th commit message:

fix fuse optimizer ops (#17102)

test=develop
# This is the 11th commit message:

truncated_gaussian_random supported in distributed training, test=develop (#17091)

# This is the 12th commit message:

 Detailed coordinate description for yolov3 loss (#17007)

* Detailed coordinate description for yolov3 loss

test=develop

* modified api.spec

test=develop

* modified loss name

* fix api.spec

test=develop

* polish description

test=develop

* modified api.spec

test=develop

# This is the 13th commit message:

fix test_weight_decay (#17109)

test=develop
# This is the 14th commit message:

Path flag (#17105)

* fix python/paddle/fluid/__init__.py detecting problems


Use CudnnWorkspaceHandle in exhaustive search (#17082) · b9494058

Committed by Huihuang Zheng on Apr 28, 2019

1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search.

test=develop


23 Apr, 2019 1 commit
- Make conv cudnn workspace size configurable (#17036) · 0c335dcd
  Committed by Zeng Jinle on Apr 23, 2019
```
* make_conv_cudnn_ws_size_configurable, test=develop

* change std::max to std::min
test=develop
```
21 Feb, 2019 1 commit
- add per kernel config and remove const_cast. · 5eb87506
  Committed by Xin Pan on Feb 21, 2019
```
test=develop
```
14 Jan, 2019 1 commit
- Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290) · 46d01d79
  Committed by chengduo on Jan 13, 2019
```
test=develop
This reverts commit 358e657f.
```
11 Jan, 2019 2 commits

Revert "Remove workspace_handle in conv_cudnn (#15186)" · 358e657f
Committed by chengduozh on Jan 11, 2019
```
test=develop
This reverts commit 064512aa.
```

Remove workspace_handle in conv_cudnn (#15186) · 064512aa

Committed by chengduo on Jan 10, 2019

* remove workspace_handle in conv2d_cudnn
test=develop

* remove workspace_handle
test=develop

* fix bug
test=develop

* make test_conv2d_op SERIAL
test=develop

* save memory in conv_cudnn
test=develop

* enhance thread safety
test=develop

* enhance temporary allocator
test=develop

* Add excess fraction
test=develop

* follow comments
test=develop

* fix bug and code refine
test=develop

* fix memory size check
test=develop

* rename reuse_tmp_allocation_excess_fraction
test=develop


10 Jan, 2019 1 commit

[Feature] support mix precision training for resnet (#14899) · fd854183

Committed by Wu Yi on Jan 10, 2019

* clip softmax for fp16

* updates

* fuse xent support fp16 test=develop

* wip

* wip

* add simple row reduce

* wip fp16 accurate softmax

* add accurate softmax kernel for fp16 test=develop

* update test=develop

* fix cpu build test=develop

* update api.spec test=develop

* follow comments test=develop

* fix build test=develop

* fix trt build test=develop

* fix inference build test=develop

* fix merge test=develop

* update test=develop

* try fix build test=develop

* fix build test=develop

* rename real_exp test=develop

* fortest

* remove hacky kernels test=develop

* clean up test=develop


26 Nov, 2018 1 commit
- Revert the changes of VLOG · 53433d7f
  Committed by minqiyang on Nov 26, 2018
```
test=develop
```
19 Nov, 2018 1 commit
- Convolution fusion operator. (#14449) · fd7e6431
  Committed by qingqing01 on Nov 19, 2018
```
* Convolution fusion operator.
* Clean code
test=develop
```
13 Nov, 2018 1 commit
- Fix compiling in cuDNN v5. · d2198184
  Committed by Dang Qingqing on Nov 13, 2018
```
test=develop
```
09 Nov, 2018 1 commit

Exhaustive search for cuDNN conv. (#14286) · abe20923

Committed by qingqing01 on Nov 09, 2018

* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop


08 Nov, 2018 1 commit
- Change the origin VLOG level to 10 times · 0c3227a5
  Committed by minqiyang on Nov 08, 2018
```
Fix code to support cpplint syntax check

test=develop
```
07 Nov, 2018 3 commits

Add fp16 backward support (#14202) · a9b5d42d

Committed by chengduo on Nov 07, 2018

* add fp16 backward support
test=develop

* add sum_op fp16 test

* disable test_dist_save_load
test=develop

* add check_grad for sum

* add unit test for softmax_grad fp16
test=develop

* add scale_op unit test

* add mul_grad_op unit test for fp16

* add cross_entropy_grad and eman_grad unit test for fp16
test=develop

* fix cross_entropy unit test

* add pool2d fp16 unit test

* refine conv2d fp16 unit test
test=develop

* refine activation unit test
test=develop

* fix ci
test=develop

* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop


Revert " Exhaustive search for cuDNN conv. (#14043)" · db8c52da
Committed by qingqing01 on Nov 07, 2018
```
This reverts commit ce7d9b07.
```

Exhaustive search for cuDNN conv. (#14043) · ce7d9b07

Committed by qingqing01 on Nov 07, 2018

* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Clean code
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.


05 Nov, 2018 1 commit
- test=develop · 60f70b17
  Committed by dzhwinter on Nov 05, 2018
26 Oct, 2018 1 commit
- staged. test speed=49ms in 1080. · 09409bad
  Committed by dzhwinter on Oct 26, 2018
25 Oct, 2018 1 commit
- remove_lock_in_some_ops · 5be6f762
  Committed by sneaxiy on Oct 25, 2018
```
test=develop
```
16 Oct, 2018 1 commit
- fix update to develop hang problem. · e41a3fcd
  Committed by dzhwinter on Oct 16, 2018
04 Sep, 2018 2 commits
- Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"" · 82a1b35b
  Committed by fengjiayi on Sep 04, 2018
```
This reverts commit 151e169e.
```
- Revert "Add CudnnHolder and use it in Conv and ConvTranspose op" · 151e169e
  Committed by guochaorong on Sep 04, 2018
31 Aug, 2018 1 commit
- make CudnnHolder thread safe · b0aca882
  Committed by fengjiayi on Aug 31, 2018
30 Aug, 2018 1 commit
- use CudnnHolder in conv_cudnn_op · 407ff0bd
  Committed by fengjiayi on Aug 30, 2018
17 Aug, 2018 1 commit
- Revert ""cherry picked operators changes" (#12184)" (#12747) · 4069262f
  Committed by dzhwinter on Aug 17, 2018
```
This reverts commit bf3c3496.
```
16 Aug, 2018 1 commit

"cherry picked operators changes" (#12184) · bf3c3496

Committed by dzhwinter on Aug 16, 2018

* "cherry picked operators changes"

* "remove duplicated code"

* "add constant setter"

* "add get expected kernel"

* "fix ci"

* "add fill constant"


31 Jul, 2018 1 commit
- Fix bug in cudnn_determistic · 040fc1c3
  Committed by Yu Yang on Jul 31, 2018
```
* Introduced by #11205
```
26 Jul, 2018 1 commit
- refine conv cudnn enforce (#12353) · 73fcfc06
  Committed by Wu Yi on Jul 26, 2018
```
* refine conv cudnn enforce

* update

* update all cudnn ops

* fix
```
06 Jun, 2018 1 commit
- Feature/deterministic (#11205) · 7971d4a3
  Committed by dzhwinter on Jun 06, 2018
```
* "fix deterministic"

* "fix ci"

* "fix init"
```
07 May, 2018 1 commit
- add fp16 support to conv3d · 8b169272
  Committed by Kexin Zhao on May 04, 2018
28 Apr, 2018 1 commit
- add FLAGS_use_deterministic_algo · c5774e32
  Committed by chengduoZH on Apr 28, 2018
04 Apr, 2018 1 commit
- enable tensor core for conv cudnn · 187ba087
  Committed by Kexin Zhao on Apr 03, 2018
17 Mar, 2018 1 commit
- update · 8ebfc153
  Committed by Kexin Zhao on Mar 16, 2018
16 Mar, 2018 3 commits
- add more tests · e967d19b
  Committed by Kexin Zhao on Mar 15, 2018
- fix test error · a13ec343
  Committed by Kexin Zhao on Mar 15, 2018
- add conv2d fp16 support · e4de5dc3
  Committed by Kexin Zhao on Mar 15, 2018
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
14 Jan, 2018 1 commit

"cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0

Committed by dzhwinter on Jan 14, 2018

* "unified operators"

* "add CUDNN register"

* "add use cudnn attribute"

* "add attribute"

* "test conv tranpose op"

* "remove duplicated attr"

* "fix op test"

* "add attribute to set cudnn"

* "add more log"

* "need layout op register support"

* "add more log"

* "change GetExpectedKernelType "

* "fix Get attr in conv_op"

* "fix CI"

* "fix tests"

* "removed kernel priority fallback"

* "fix CI"

* "fix stack pointer bug"

* "refine buggy interface"

* "add const cast to save life"

* "fix get_output_with_grad"

* "fix op test with dataformat"

* ""fix pooling

* "fix pooling test"

* "fix CI"

* "fix with_gpu error"

* "add transform needed functional check"

* "fix unpack list error"

* "comment out parallel.do temporary"

* "fix CI"

* "fix compile doc error"

* "make threshold larger"

