Commits · fcc0452c8b1e3a2687b433c1c851eca8a934d9f5 · BaiXuePrincess / Paddle

07 Nov, 2018 · 1 commit

Add fp16 backward support (#14202) · a9b5d42d

Committed by chengduo on Nov 07, 2018

* add fp16 backward support
test=develop

* add sum_op fp16 test

* disable test_dist_save_load
test=develop

* add check_grad for sum

* add unit test for softmax_grad fp16
test=develop

* add scale_op unit test

* add mul_grad_op unit test for fp16

* add cross_entropy_grad and eman_grad unit test for fp16
test=develop

* fix cross_entropy unit test

* add pool2d fp16 unit test

* refine conv2d fp16 unit test
test=develop

* refine activation unit test
test=develop

* fix ci
test=develop

* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop

14 Oct, 2018 · 1 commit

compile in linux · 3ae96450

Committed by wanghaoshuang on Oct 14, 2018

20 Sep, 2018 · 1 commit

Feature/op_fuse_pass (#12440) · d402234b

Committed by chengduo on Sep 20, 2018

* Add Preface

* Add demo code

* Save file

* Refine code

* seems can work

* use elementwise strategy

* Use ElementwiseComputeEx

* Add comments

* extract functions from operator

* Refine code

* Follow comment

* code refine

* add op_fuse  pass

* add backward

* code refine

* use TopologySortOperations

* follow comments

* refine IsFusible

* code enhance

* fix op_fusion_pass

* refine code

* refine fuse_elemwise_act_op

* adjust the input and output

* refine logic

* add intermediate_edge

* disable inplace

* follow comments

* refine logic

* follow comments

* Remove the removable IntermediateOut

* change strategy

* code refine

* enable fuse backward

* code refine

* code refine

* rename unit test

* follow comments

12 Sep, 2018 · 1 commit

add demo · c3e1fb5a

Committed by dzhwinter on Sep 12, 2018

03 Sep, 2018 · 1 commit

fix elementwise (#13146) · 856c26fa

Committed by dzhwinter on Sep 03, 2018

30 Aug, 2018 · 1 commit

Enhance fused_elementwise_activation_op (#12837) · 3bd1d22a

Committed by chengduo on Aug 30, 2018

* Enhance the function of fused_elementwise_activation_op

* enhance unit test

* Clean Code And Add Doc

* Add compound functors

* Fix doc and enhance unit test

* define Dx and Dy for d_binary_func

* add mul_scale

* add mul_scale

* add elementwise_mul

* code refine

* code refine

* add doc

* add  AsIntermediate

27 Aug, 2018 · 1 commit

operator module is done · cd8f3e9e

Committed by dzhwinter on Aug 27, 2018

20 Aug, 2018 · 1 commit

fix SEGV elementwise add at debug mode · 0507f7bc

Committed by tensor-tang on Aug 20, 2018

17 Aug, 2018 · 1 commit

Revert ""cherry picked operators changes" (#12184)" (#12747) · 4069262f

Committed by dzhwinter on Aug 17, 2018
```
This reverts commit bf3c3496.
```

16 Aug, 2018 · 1 commit

"cherry picked operators changes" (#12184) · bf3c3496

Committed by dzhwinter on Aug 16, 2018

* "cherry picked operators changes"

* "remove duplicated code"

* "add constant setter"

* "add get expected kernel"

* "fix ci"

* "add fill constant"

10 Aug, 2018 · 1 commit

"fix style" (#12600) · 8499559c

Committed by dzhwinter on Aug 10, 2018

01 Aug, 2018 · 1 commit

explicit gradient of elementwise_add/elementwise_sub (#11970) · 595a2c83

Committed by dzhwinter on Aug 01, 2018

* "add gradient register"

* "make some enhance"

* "better format"

* "fix typo"

* "fix reuse"

* "fix get expected kernel"

* "change the mkldnn code"

* "fix mkldnn"

* "fix mkldnn failed test"

* "add comment"

03 May, 2018 · 1 commit

Fix __shfl_down_sync_ of cross_entropy (#10345) · 4fbde42c

Committed by chengduo on May 03, 2018
```
* fix __shfl_down_sync_ of cross_entropy

* use reduceSum

* "fix ci"
```

30 Apr, 2018 · 1 commit

Feature/cuda9 cudnn7 (#10140) · eb6f9dd5

Committed by dzhwinter on Apr 30, 2018
```
* "re-commit "

* "picked up"

* "fix ci"

* "fix pdb hang up issue in cuda 9"
```

24 Apr, 2018 · 1 commit

fix elementwise_grad op kernel and add unit test · d06c79c7

Committed by chengduoZH on Apr 24, 2018

10 Apr, 2018 · 1 commit

Move reduceSum to elementwise_op_function.h (#9773) · b1224da8

Committed by chengduo on Apr 10, 2018
```
* add cuda_device_functions.h

* move reduceSum to elementwise_op_function.h
```

06 Mar, 2018 · 1 commit

refine elementwise_mul_op · a1331f98

Committed by chengduoZH on Mar 06, 2018

28 Feb, 2018 · 1 commit

Correctly handling variable with batch dimension for math ops. · e9b8ebf4

Committed by xuwei06 on Feb 22, 2018

When the second argument contains a batch dimension, the axis should be 0.

Also makes elementwise ops more tolerant when handling tensors with trailing
singleton (size-1) dimensions.
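
The behaviour this message describes can be sketched in NumPy. This is an illustration only, not Paddle's implementation: `elementwise_add_with_axis` is a hypothetical helper, and the rule it applies (align `y` with `x` starting at `axis`, after dropping `y`'s trailing size-1 dimensions) is an assumption drawn from the commit message.

```python
import numpy as np


def elementwise_add_with_axis(x, y, axis=-1):
    """Hypothetical helper (not Paddle's API): add y to x, aligning
    y's dimensions with x's dimensions starting at `axis`."""
    y_shape = list(y.shape)
    # Assumed tolerance rule: ignore trailing size-1 dimensions of y.
    while len(y_shape) > 1 and y_shape[-1] == 1:
        y_shape.pop()
    if axis == -1:
        # Default: align y with the last dimensions of x.
        axis = x.ndim - len(y_shape)
    if axis < 0 or axis + len(y_shape) > x.ndim:
        raise ValueError("y does not fit into x at the given axis")
    if tuple(x.shape[axis:axis + len(y_shape)]) != tuple(y_shape):
        raise ValueError("y's shape must match a contiguous slice of x's shape")
    # Pad y with size-1 dimensions on both sides so NumPy can broadcast it.
    padded = [1] * axis + y_shape + [1] * (x.ndim - axis - len(y_shape))
    return x + y.reshape(padded)


x = np.zeros((2, 3, 4, 5))
# y carries the batch dimension (size 2) plus a trailing size-1 dimension,
# so it is aligned with x at axis 0.
y = np.arange(2.0).reshape(2, 1)
out = elementwise_add_with_axis(x, y, axis=0)
print(out.shape)  # (2, 3, 4, 5)
```

With `axis=0`, each batch entry of `y` is added to the whole corresponding slice `x[i]`; with `axis=1` and a `y` of shape `(3, 4)`, the same helper would align `y` with the middle two dimensions instead.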

26 Feb, 2018 · 1 commit

refine Sum · b8938b44

Committed by chengduoZH on Feb 24, 2018

24 Feb, 2018 · 2 commits

follow comments · a8288392

Committed by chengduoZH on Feb 24, 2018

refine Sum · 22b9ab05

Committed by chengduoZH on Feb 24, 2018

23 Feb, 2018 · 2 commits

fix get_mid_dims annotation (#8490) · 0e187bc9

Committed by chengduo on Feb 23, 2018

Speed up elemwise grad (#8402) · 88c22e9d

Committed by Yu Yang on Feb 23, 2018
```
* Speed up elemwise grad

* Fix bug

* Add macro for MAX_BLOCK_DIM
```

13 Feb, 2018 · 1 commit

Make print_op able to show the value of bool tensor · 004df46f

Committed by xuwei06 on Feb 12, 2018
```
And some minor fixes on comments.
```

12 Feb, 2018 · 1 commit

Fix the grammar in copyright. (#8403) · 24509f4a

Committed by qingqing01 on Feb 12, 2018

10 Feb, 2018 · 2 commits

Correct #include path · fc374821

Committed by Yi Wang on Feb 09, 2018

Move file to fluid/; Edit CMakeLists.txt · 90648f33

Committed by Yi Wang on Feb 09, 2018

03 Feb, 2018 · 1 commit

Add layer norm [GPU] · 76e188e5

Committed by chengduoZH on Feb 02, 2018

02 Feb, 2018 · 1 commit

refine elementwise_op · affce733

Committed by chengduoZH on Feb 02, 2018

22 Jan, 2018 · 1 commit

Fix CI · 2024489b

Committed by Yang Yu on Jan 22, 2018

19 Jan, 2018 · 1 commit

Make compare_op reuse elemwise_op_funcs · 9c0b2901

Committed by Yang Yu on Jan 19, 2018

17 Jan, 2018 · 1 commit

make elementwise op support scalar as input Y · 14f6fa34

Committed by fengjiayi on Jan 17, 2018

16 Jan, 2018 · 1 commit

Refine code · ead7059b

Committed by fengjiayi on Jan 16, 2018

15 Jan, 2018 · 1 commit

remove unnecessary functor1 · 6ee8a2e1

Committed by fengjiayi on Jan 15, 2018

26 Dec, 2017 · 1 commit

unify the indentation of license · 761b3297

Committed by Luo Tao on Dec 26, 2017

25 Dec, 2017 · 1 commit

refine iterator · bcf0b56f

Committed by chengduoZH on Dec 23, 2017

19 Dec, 2017 · 2 commits

refine · f1ab13bd

Committed by chengduoZH on Dec 19, 2017

refine elementwise · 5e04b64f

Committed by chengduoZH on Dec 19, 2017

16 Dec, 2017 · 1 commit

refine cos-sim-op · 784740d8

Committed by chengduoZH on Dec 11, 2017

12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
