提交 · d48320777e244efaec898548d4b196fdd5c1eae8 · BaiXuePrincess / Paddle

03 1月, 2020 1 次提交

Add the first implememtation of fusion_group op (#19621) · d4832077

由 Yiqun Liu 提交于 1月 03, 2020

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Refine the calling of PADDLE_ENFORCE.
test=develop

d4832077

21 5月, 2019 1 次提交

Add LAMB Optimizer support (#17489) · f9796b12

由 Yibing Liu 提交于 5月 21, 2019

* Add LAMB optimizer

* Expose LAMB Optimizer's APIs

test=develop, test=document_preview

* Cleanup code & doc

test=develop, test=document_preview

* Update lamb optimizer's formula

test=develop

f9796b12

11 12月, 2018 1 次提交
- Y
  Fix Eigen macro when using GPU · 7604b1ad
  由 Yu Yang 提交于 12月 11, 2018
```
The macro should be defined by compiler rather than by source.

test=develop
```
  7604b1ad
16 11月, 2018 1 次提交

Refine operator cmake (#14413) · a2d9b344

由 Wu Yi 提交于 11月 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

12 2月, 2018 1 次提交
- Q
  
  Fix the grammar in copyright. (#8403) · 24509f4a
  由 qingqing01 提交于 2月 12, 2018
  
  24509f4a
10 2月, 2018 2 次提交
- Y
  
  Correct #include path · fc374821
  由 Yi Wang 提交于 2月 09, 2018
  
  fc374821
- Y
  
  Move file to fluid/; Edit CMakeLists.txt · 90648f33
  由 Yi Wang 提交于 2月 09, 2018
  
  90648f33
26 12月, 2017 1 次提交
- L
  
  unify the indentation of license · 761b3297
  由 Luo Tao 提交于 12月 26, 2017
  
  761b3297
12 12月, 2017 1 次提交

Refine device context (#6433) · 61ec0b95

由 QI JUN 提交于 12月 12, 2017

There are mainly following fixes:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

21 11月, 2017 1 次提交
- Y
  Support many data types of several operators (#5731) · a5e73f9e
  由 Yu Yang 提交于 11月 21, 2017
```
* Support many data types of several operators

* SeqConv only support float/double

* Revert adagrad
```
  a5e73f9e
13 10月, 2017 1 次提交

Adding the Adam Optimizer operator (#4733) · 11680037

由 Abhinav Arora 提交于 10月 12, 2017

* add adam op

moment1_out = beta1 * moment1 + (1 − beta1) * grad
moment2_out = beta2 * moment2 + (1 − beta2) * grad * grad
moment1_hat =  moment1_out / (1 - beta1^t)
moment2_hat =  moment2_out / (1 - beta2^t)
param_out = param - learning_rate * moment1_hat / (sqrt(moment2_hat) +
epsilon)

* fix moment 2

* Adding the Adam optimization operator

* Adding more tests for Adam op

11680037

07 8月, 2017 1 次提交
- D
  
  "remove alias to more operators" · 6b23b91c
  由 dongzhihong 提交于 8月 07, 2017
  
  6b23b91c
04 8月, 2017 1 次提交
- L
  
  Add cpplint for *.h and cuda *.cu · b58725bd
  由 liaogang 提交于 8月 04, 2017
  
  b58725bd
31 7月, 2017 1 次提交
- Q
  
  add EIGEN_USE_GPU macro to op.cu file · 61f94f00
  由 qijun 提交于 7月 31, 2017
  
  61f94f00
25 7月, 2017 1 次提交
- Y
  Add type_alias to import framework into ops · efc119b4
  由 Yu Yang 提交于 7月 25, 2017
```
Make implement an operator less noisy.
```
  efc119b4
19 7月, 2017 1 次提交
- Q
  Add sgd op (#2950) · e3b27d19
  由 Qiao Longfei 提交于 7月 19, 2017
```
* a simplest SGD op
```
  e3b27d19

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致