Commits · 5dbe9e597c4550d1d95cdbe51911e2a3881f8ff1 · 机器未来 / Paddle

02 Dec, 2019 1 commit

[cherry-pick] Improve topk performance. (#21087) (#21441) · 5dbe9e59

Committed by zhaoyuchen2018 on Dec 02, 2019

* Improve topk performance.

give 200000 data to compute topk,
before opt: cost 1s
after opt: cost 0.0028s.

* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

31 Jul, 2019 1 commit
- GPU allocation uses fraction of available memory (#18896) · ea6ee76f
  Committed by Huihuang Zheng on Jul 31, 2019
```
GPU allocation uses fraction of available memory, also fix the GetUsed without lock
```
21 Mar, 2019 1 commit

add more unittest · 953214ad

Committed by sneaxiy on Mar 19, 2019

modify allocator strategy
remove changes of legacy buddy_allocator
test=develop

19 Mar, 2019 1 commit
- add allocator flags · 22715487
  Committed by zhhsplendid on Mar 19, 2019
```
test=develop
```
04 Dec, 2018 1 commit

[Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661) · 29d9fb53

Committed by Wu Yi on Dec 04, 2018

* wip multi process multi gpu dist training

* workable for p2p

* update test=develop

* change back env name test=develop

* fix alloc init

* fix cpu build test=devlop

* fix mac tests test=develop

* refine code

* refine test=develop

22 Nov, 2018 1 commit

Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) · 00b9e9a1

Committed by chengduo on Nov 22, 2018

* refine cublase
test=develop

* code refine

* refine cublas

* add GEMME_EX

* add enable_cublas_tensor_op_math doc and add cublasCall
test=develop

* fix CublasCall for cuda version
test=develop

* fix error
test=develop

* fix GEMM_EX to be compatible with gcc 4.8
test=develop

* add GEMM_EX
test=develop

* to compatiable with gcc4.8
test=develop

15 Oct, 2018 1 commit
- add cuda version display (#13885) · 2c9839c8
  Committed by chengduo on Oct 15, 2018
```
test=develop
```
27 Sep, 2018 1 commit
- Revert "Some trivial optimization (#13530)" · a4f7696a
  Committed by typhoonzero on Sep 27, 2018
```
This reverts commit 1d91a49d.
```
26 Sep, 2018 1 commit

Some trivial optimization (#13530) · 1d91a49d

Committed by chengduo on Sep 26, 2018

* some trivial opt

* remove the fix of lod_tensor and shrink_rnn_memory_op

* refine ShrinkRNNMemoryOp

test=develop

23 Apr, 2018 1 commit
- Add synchronous TensorCopy and use it in double buffer · 9f11da59
  Committed by fengjiayi on Apr 23, 2018
08 Apr, 2018 2 commits
- Update (#9717) · 535646cf
  Committed by Yi Wang on Apr 07, 2018
- Fix cpplint errors with paddle/fluid/platform/gpu_info.* (#9710) · 0c43a376
  Committed by Yi Wang on Apr 07, 2018
```
* Fix cpplint errors with paddle/fluid/platform/gpu_info.*

* Update
```
10 Mar, 2018 1 commit
- add gpu info func to get compute cap · 1998d5af
  Committed by Kexin Zhao on Mar 09, 2018
03 Mar, 2018 1 commit
- get max threads of GPU · 00e596ed
  Committed by chengduoZH on Mar 02, 2018
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 1 commit
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
22 Dec, 2017 1 commit

"remove GPU Sync Interface" (#6793) · abde3130

Committed by dzhwinter on Dec 22, 2017

* "remove GPU Sync Interface"

* "fix typo"

* "fix type cast error"

* "fix related Copy with stream"

* "fix failed tests with DevicePool"

* "fix stupid removed position error"

16 Nov, 2017 1 commit
- "fix accuracy kernel bug" (#5673) · e97b8987
  Committed by dzhwinter on Nov 15, 2017
```
* "fix accuracy kernel bug"

* "relauch ci"
```
10 Oct, 2017 1 commit
- remove unused PADDLE_ONLY_CPU comment · 871a3f6e
  Committed by Luo Tao on Oct 10, 2017
05 Oct, 2017 3 commits

Rename platform::GetDeviceCount into platform::GetCUDADeviceCount · 2b204f04

Committed by Yi Wang on Oct 04, 2017

Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c

Committed by Yi Wang on Oct 04, 2017

Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94

Committed by Yu Yang on Oct 04, 2017

By shell command

```bash
sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
```

26 Sep, 2017 1 commit
- fix nv_library (#4370) · d0ad82cf
  Committed by Qiao Longfei on Sep 25, 2017
```
* fix nv_library

* fix symbol in gpu_info.h
```
18 Aug, 2017 2 commits
- follow comments · b3ab15a7
  Committed by liaogang on Aug 18, 2017
- Add ENVIRONMENT interface interface · 55437b58
  Committed by liaogang on Aug 18, 2017
19 Jul, 2017 1 commit
- Add cuda memcpy in gpu_info · b0588641
  Committed by liaogang on Jul 19, 2017
11 Jul, 2017 1 commit
- FIX: merge conflicts · 383b96f3
  Committed by liaogang on Jul 11, 2017
04 Jul, 2017 1 commit
- ENH: Add buddy allocator Free · 0ba63475
  Committed by liaogang on Jul 04, 2017
29 Jun, 2017 2 commits
- ENH: Add gpu info interface · 6e7209f0
  Committed by liaogang on Jun 29, 2017
- ENH: Add Gpu info · d3b77a5b
  Committed by liaogang on Jun 29, 2017
28 Jun, 2017 2 commits
- ENH: clang-format · 9490d243
  Committed by liaogang on Jun 28, 2017
- ENH: Add cuda.h in platform · dde0da9e
  Committed by liaogang on Jun 28, 2017
21 Mar, 2017 2 commits
- Move Error out of Common.h · 26716dd2
  Committed by liaogang on Mar 21, 2017
- Add simd check and set SSE3 as default compilation · 6f22951a
  Committed by liaogang on Mar 21, 2017
05 Jan, 2017 2 commits
- Revise common to Common · 72b95533
  Committed by gangliao on Jan 05, 2017
- Move Execepts into arch/osx dir · be8b1268
  Committed by liaogang on Jan 05, 2017
27 Dec, 2016 1 commit
- Fix merge errors. · eca45928
  Committed by Yu Yang on Dec 27, 2016
23 Dec, 2016 1 commit
- Add common.h and remove DisableCopy and Typedefs · c8d0791a
  Committed by liaogang on Dec 23, 2016
09 Dec, 2016 1 commit
- Change "Baidu, Inc" into "PaddlePaddle Authors" · e9549cbb
  Committed by Yi Wang on Dec 08, 2016
22 Nov, 2016 1 commit
- clang format .cc .h .cpp .c and .hpp file · 80c68d38
  Committed by Luo Tao on Nov 22, 2016

机器未来 / Paddle is in sync with the fork source project