Commits · 7f5d532a9c4d2b7355e848d4f7ce847782207eef · PaddlePaddle / Paddle

06 Dec, 2019 1 commit
- refine dev_ctx.Wait() exception throw, test=develop (#21600) · 97e76cb9
  Committed by Zeng Jinle on Dec 06, 2019
  
  97e76cb9
29 Nov, 2019 1 commit
- [MKL-DNN] LRN and Pool2d (FWD) NHWC support (#21375) · cd43c444
  Committed by Jacek Czaja on Nov 29, 2019
  
  cd43c444
18 Nov, 2019 1 commit

fix sporadically hang issue on windows(#21201) · d8b6cf2b

Committed by liuwei1031 on Nov 18, 2019

cudaStreamSynchronize randomly hangs when used in a multi-threaded environment, so replace it with the cudaStreamQuery API on Windows

d8b6cf2b

14 Nov, 2019 1 commit

Improve topk performance. (#21087) · b93870e6

Committed by zhaoyuchen2018 on Nov 13, 2019

* Improve topk performance.

Computing topk on 200000 elements:
before opt: cost 1s
after opt: cost 0.0028s.

* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

b93870e6

24 Sep, 2019 1 commit
- fix cuda dev_ctx allocator cmake deps, test=develop (#19953) · 37f76407
  Committed by Zeng Jinle on Sep 24, 2019
  
  37f76407
22 Sep, 2019 1 commit

Add lock to cudnn handle calls (#19845) · c7f36e7c

Committed by Zeng Jinle on Sep 22, 2019

* refine reallocate of workspace size, test=develop

* add lock to cudnn handle calls, test=develop

c7f36e7c

11 Sep, 2019 1 commit

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used to allocate memory for cuDNN. Since it is a singleton, we can remove it for better memory performance.

We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.

Also added data_feed_proto to operator to fix CI in CPU compilation

12542320

12 Aug, 2019 1 commit
- Polish fleet API to support cuda collective mode and nccl2 mode. (#18966) · 29d87812
  Committed by gongweibao on Aug 12, 2019
```
Polish fleet API to support cuda collective mode and nccl2 mode
```
  29d87812
11 Jul, 2019 1 commit

add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331

Committed by Tao Luo on Jul 11, 2019

* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop

076f8331

08 Jul, 2019 1 commit

add mkldnn shapeblob cache clear strategy (#18513) · fe32879d

Committed by Tao Luo on Jul 08, 2019

* add mkldnn shapeblob cache clear strategy

test=develop

* refine with comments

test=develop

* make cache clear strategy safer

test=develop

* add lock for GetShapeBlobSize

test=develop

fe32879d

03 Jul, 2019 1 commit
- add shape_blob for cache mkldnn primitive (#18454) · 3f3112ce
  Committed by Tao Luo on Jul 03, 2019
```
test=develop
```
  3f3112ce
02 Jul, 2019 1 commit

rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453) · 8f5fffca

Committed by Leo Zhao on Jul 02, 2019

* rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id()

test=develop

* update session id definition and adjust logic for default behavior

test=develop

* reset logic in mkldnn reuse, as most cases work with the default.

test=develop

8f5fffca

27 Jun, 2019 1 commit
- Reset DeviceContext after quantization warmup (#18182) · 84096932
  Committed by Michał Gallus on Jun 27, 2019
```
test=develop
```
  84096932
18 Jun, 2019 1 commit
- Remove nccl dep when the number of GPU is 1 (#18158) · 4978db2c
  Committed by chengduo on Jun 18, 2019
```
* remove nccl dep when the number of GPU is 1
test=develop
```
  4978db2c
10 Jun, 2019 1 commit
- Remove attribute in Allocator::Allocate (#17878) · 3ece61f7
  Committed by Zeng Jinle on Jun 10, 2019
```
* remove attribute in Allocator::Allocate, test=develop

* fix travis ci error, test=develop
```
  3ece61f7
07 Jun, 2019 1 commit
- Fix cuda/cudnn version detection error (#17853) · 3925bd81
  Committed by Zeng Jinle on Jun 07, 2019
```
* fix cuda/cudnn version detection error, test=develop

* fix again, test=develop
```
  3925bd81
28 Mar, 2019 1 commit
- Add DGC(Deep Gradient Compression) interface. (#15841) · eb83abea
  Committed by gongweibao on Mar 28, 2019
  
  eb83abea
25 Mar, 2019 1 commit
- fix ci bug: cudnn handler in multi card · a1d11bb1
  Committed by nhzlx on Mar 25, 2019
```
test=develop
```
  a1d11bb1
20 Mar, 2019 1 commit
- git cherry-pick from feature/anakin-engine: update anakin subgraph #16278 · 07dcf285
  Committed by nhzlx on Mar 20, 2019
  
  07dcf285
19 Mar, 2019 1 commit
- add allocator flags · 22715487
  Committed by zhhsplendid on Mar 19, 2019
```
test=develop
```
  22715487
16 Mar, 2019 1 commit
- Fix windows compiling (#16230) · 86e912c5
  Committed by qingqing01 on Mar 16, 2019
```
test=develop
```
  86e912c5
15 Mar, 2019 1 commit

Support sync batch norm. (#16121) · 8ad672a2

Committed by qingqing01 on Mar 15, 2019

* Support Sync Batch Norm.
* Note: do not enable it on a single device.

Usage:

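import paddle.fluid as fluid

# tp and loss_mean are assumed to be defined by the caller
# (e.g. a Program and its loss variable).
build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
        loss_name=loss_mean.name,
        build_strategy=build_strategy)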

8ad672a2

22 Feb, 2019 1 commit
- Change *(smart_ptr.get()) -> *smart_ptr · 74672d1a
  Committed by Sylwester Fraczek on Feb 07, 2019
```
reason: dereferencing a smart pointer is the same as dereferencing the underlying pointer
test=develop
```
  74672d1a
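
The reasoning given in commit 74672d1a can be illustrated with a minimal sketch. It assumes a `std::shared_ptr` and a made-up `Context` type; neither the type nor the helper functions come from the Paddle sources. For a non-null standard smart pointer, `*(ptr.get())` and `*ptr` refer to the same object, so the shorter spelling should be a safe substitution.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type for illustration only; not taken from the Paddle sources.
struct Context {
  int device_id = 0;
};

// Old spelling: fetch the raw pointer, then dereference it.
int DeviceIdViaGet(const std::shared_ptr<Context>& ptr) {
  return (*(ptr.get())).device_id;
}

// New spelling: operator* on the smart pointer yields the same object.
int DeviceIdViaDeref(const std::shared_ptr<Context>& ptr) {
  return (*ptr).device_id;
}

// True when both spellings refer to the very same object in memory.
bool SameObject(const std::shared_ptr<Context>& ptr) {
  return &(*(ptr.get())) == &(*ptr);
}
```

Both helpers require a non-null pointer; dereferencing an empty smart pointer is undefined behavior with either spelling.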
19 Feb, 2019 1 commit
- fix many warning · 209b3557
  Committed by sneaxiy on Feb 19, 2019
```
test=develop
```
  209b3557
16 Jan, 2019 1 commit
- Add single GPU support to imperative · 315b133e
  Committed by minqiyang on Jan 16, 2019
  
  315b133e
11 Jan, 2019 3 commits

fix thread safe bug · c4eced98
Committed by chengduozh on Jan 11, 2019
```
test=develop
```
c4eced98
Revert "Remove workspace_handle in conv_cudnn (#15186)" · 358e657f
Committed by chengduozh on Jan 11, 2019
```
test=develop
This reverts commit 064512aa.
```
358e657f

Remove workspace_handle in conv_cudnn (#15186) · 064512aa

Committed by chengduo on Jan 10, 2019

* remove workspace_handle in conv2d_cudnn
test=develop

* remove workspace_handle
test=develop

* fix bug
test=develop

* make test_conv2d_op SERIAL
test=develop

* save memory in conv_cudnn
test=develop

* enhance thread safety
test=develop

* enhance temporary allocator
test=develop

* Add excess fraction
test=develop

* follow comments
test=develop

* fix bug and code refine
test=develop

* fix memory size check
test=develop

* rename reuse_tmp_allocation_excess_fraction
test=develop

064512aa

08 Jan, 2019 2 commits
- Revert "Revert "Remove op handle lock"" · ed409ac9
  Committed by sneaxiy on Jan 08, 2019
```
test=develop
```
  ed409ac9
- Revert "Remove op handle lock" · dacfaaa9
  Committed by Zeng Jinle on Jan 08, 2019
```
test=develop
```
  dacfaaa9
07 Jan, 2019 1 commit
- fix_cudnn_compatible_check · 9793a0b6
  Committed by sneaxiy on Jan 07, 2019
  
  9793a0b6
02 Jan, 2019 1 commit
- remove_op_handle_lock · d0a8a1e9
  Committed by sneaxiy on Jan 02, 2019
```
test=develop
```
  d0a8a1e9
29 Dec, 2018 1 commit
- remove tensor core lock · d25395fc
  Committed by sneaxiy on Dec 29, 2018
```
test=develop
```
  d25395fc
25 Dec, 2018 1 commit

Move GetTensor to tensor_util (#15011) · b9fb03cf

Committed by chengduo on Dec 25, 2018

* refine tensor
test=develop

* refine tensor
test=develop

* fix device_context log
test=develop

b9fb03cf

21 Dec, 2018 1 commit

[Feature] Add Temporary Allocator (#14875) · 79bd6dfa

Committed by chengduo on Dec 21, 2018

* Add Temporary Allocator

* add Temporary Allocator to DeviceContext
test=develop

* code refine
test=develop

* fix mean_iou
test=develop

* Add DeviceTemporaryAllocator
test=develop

* fix conv_op bug
test=develop

* small fix
test=develop

* code refine
test=develop

* log refine
test=develop

* fix unit test
test=develop

* move double check

* refine concat_and_split
test=develop

* add limit_of_temporary_allocation
test=develop

* fix name
test=develop

79bd6dfa

14 Dec, 2018 1 commit
- Fea/fuse conv elementwise add fuse (#14669) · a985949b
  Committed by Yan Chunwei on Dec 14, 2018
  
  a985949b
10 Dec, 2018 1 commit
- add cuda cudnn version check · 66182abd
  Committed by sneaxiy on Dec 10, 2018
```
test=develop
```
  66182abd
06 Dec, 2018 1 commit
- fix thread-safety bug · 0f96c2e8
  Committed by sneaxiy on Dec 05, 2018
```
test=develop
```
  0f96c2e8
05 Dec, 2018 1 commit
- fix deallocate bug · 90076522
  Committed by sneaxiy on Dec 05, 2018
```
test=develop
```
  90076522
14 Nov, 2018 1 commit
- Refine code · d93b2d03
  Committed by Yu Yang on Nov 14, 2018
  
  d93b2d03

PaddlePaddle / Paddle synced successfully about 1 year ago
