Commits · e8d6ff5096acd1066658e819537ba023602bcf0f · 机器未来 / Paddle

25 May, 2021 · 1 commit

modify complex template for elementwise ops (#33071) · dbc08d69

Committed by chentianyu03 on May 25, 2021

* modify complex template for elementwise ops

* modify mul, div grad struct

* add complex template for CudaShuffleDownSync CudaShuffleXorSync funcs and fix the bug when delete cuda<9000

* fix shuffle func args bug

* fix shuffle func args bug

* fix shuffle func args bug

dbc08d69

20 May, 2021 · 1 commit
- Binary functor invoking of elementwise broadcast (#32928) · 14949521
  Committed by limingshu on May 20, 2021
  14949521
14 May, 2021 · 1 commit
- Optimization the broadcast performance of elementwise_add (#32512) · b035c8b0
  Committed by limingshu on May 14, 2021
  b035c8b0
10 May, 2021 · 1 commit
- Support different data type between input and output (#32823) · 3419de53
  Committed by Zhang Zheng on May 10, 2021
  3419de53
22 Apr, 2021 · 1 commit
- Modify some contents for elementwise op impl (#32414) · 890d6bc0
  Committed by Zhang Zheng on Apr 22, 2021
  890d6bc0
18 Apr, 2021 · 1 commit
- Unify the implementation of elementwise operation of same dimensions (#32148) · 2c182583
  Committed by Zhang Zheng on Apr 18, 2021
  2c182583
03 Apr, 2021 · 1 commit
- Optimize elementwise_add_grad op, test=develop (#32051) · 1e52f324
  Committed by jiangcheng on Apr 03, 2021
  1e52f324
01 Apr, 2021 · 1 commit
- Optimize the perf of SameDimsAdd CUDA Kernel (#31872) · 4acc87be
  Committed by Zhang Zheng on Apr 01, 2021
  4acc87be
08 Dec, 2020 · 1 commit
- Revert "improve elementwise_add_grad perf (#29277)" (#29464) · 560b4323
  Committed by Zhang Ting on Dec 08, 2020
```
This reverts commit befd6d53.
```
  560b4323
03 Dec, 2020 · 1 commit
- improve elementwise_add_grad perf (#29277) · befd6d53
  Committed by Zhang Ting on Dec 03, 2020
```
* improve performance of elementwise_sum_grad
```
  befd6d53
01 Dec, 2020 · 1 commit

add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199) · 8f45d142

Committed by chentianyu03 on Dec 01, 2020

* add complex64 and complex128 type; add +-*/@ and slice operator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

8f45d142

21 Sep, 2020 · 1 commit

[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112) · aba759ba

Committed by Leo Chen on Sep 21, 2020

* support use add instead of sum to do gradient accumulation

* add inplace addto pass

* add grad_add op and inplace addto pass

* remove debug code

* code refine

* fix bug when several sum ops are inserted at the same op_idx

* fix Flags type

* add addto attribute for conv3d

* fix ut

* code clean

* fix type

aba759ba

24 Oct, 2019 · 1 commit
- fix fp16 grid_size for size=1; test=develop (#20812) · 9171f737
  Committed by danleifeng on Oct 24, 2019
  9171f737
30 Sep, 2019 · 1 commit
- Improve elementwise operators performance in same dimensions. (#19763) · 425279a5
  Committed by danleifeng on Sep 30, 2019
```
Improve elementwise operators performance in same dimensions
```
  425279a5
18 Sep, 2019 · 1 commit

Update elementwise double grad to save gpu memory (#19509) · 982e61f5

Committed by Leo Chen on Sep 18, 2019

* update elementwise double grad to save gpu memory, test=develop

* update elementwise_mul/div_grad_grad to save memory, test=develop

* remove eval function in eigen statement to save memory, test=develop

* add unittest for elementwise_div_grad_grad without dout, test=develop

* add unittest for elementwise_add_grad_grad without ddx, test=develop

* add float16 cuda kernel for elementwise double grad op, test=develop

982e61f5

14 May, 2019 · 1 commit
- add elementwise_add_grad_grad op (#17366) · bd9bef5a
  Committed by Kaipeng Deng on May 14, 2019
```
* add elementwise_add_grad_grad op. test=develop

* use defined GradMaker. test=develop
```
  bd9bef5a
13 May, 2019 · 1 commit

Optimize the elementwise op using eigen (#15494) · dcda2023

Committed by Yiqun Liu on May 13, 2019

* Optimize the elementwise op with CUDA kernels.
test=develop

* Support setting of attr in op config file.
test=develop

* Add support for setting dtype and initializer in config.
test=develop

* Save workspace.

* Add initializer "zeros".
test=develop

* Fix compiling error.

* Support the use of an existing file to initialize tensor in op_tester.

* Use eigen to optimize the elementwise_add/mul for the case that x and y have the same dims.
test=develop

dcda2023

11 Dec, 2018 · 1 commit
- Fix Eigen macro when using GPU · 7604b1ad
  Committed by Yu Yang on Dec 11, 2018
```
The macro should be defined by the compiler rather than by the source.

test=develop
```
  7604b1ad
16 Nov, 2018 · 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

07 Nov, 2018 · 1 commit

Add fp16 backward support (#14202) · a9b5d42d

Committed by chengduo on Nov 07, 2018

* add fp16 backward support
test=develop

* add sum_op fp16 test

* disable test_dist_save_load
test=develop

* add check_grad for sum

* add unit test for softmax_grad fp16
test=develop

* add scale_op unit test

* add mul_grad_op unit test for fp16

* add cross_entropy_grad and eman_grad unit test for fp16
test=develop

* fix cross_entropy unit test

* add pool2d fp16 unit test

* refine conv2d fp16 unit test
test=develop

* refine activation unit test
test=develop

* fix ci
test=develop

* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop

a9b5d42d

17 Aug, 2018 · 1 commit
- Revert ""cherry picked operators changes" (#12184)" (#12747) · 4069262f
  Committed by dzhwinter on Aug 17, 2018
```
This reverts commit bf3c3496.
```
  4069262f
16 Aug, 2018 · 1 commit

"cherry picked operators changes" (#12184) · bf3c3496

Committed by dzhwinter on Aug 16, 2018

* "cherry picked operators changes"

* "remove duplicated code"

* "add constant setter"

* "add get expected kernel"

* "fix ci"

* "add fill constant"

bf3c3496

14 Aug, 2018 · 1 commit
- Revert "Refine elementwise_add op" · 6a2a9a83
  Committed by tensor-tang on Aug 14, 2018
  6a2a9a83
06 Aug, 2018 · 1 commit
- refine elementwise_add op · b2d0ee51
  Committed by sneaxiy on Aug 06, 2018
  b2d0ee51
20 Mar, 2018 · 2 commits
- rearrange test · 3da094fd
  Committed by Kexin Zhao on Mar 19, 2018
  3da094fd
- add fp16 kernel for elementwise add · 4bf168b2
  Committed by Kexin Zhao on Mar 19, 2018
  4bf168b2
12 Feb, 2018 · 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
  24509f4a
10 Feb, 2018 · 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
  fc374821
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
  90648f33
26 Dec, 2017 · 1 commit
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017
  761b3297
12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main fixes are as follows:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

15 Nov, 2017 · 1 commit
- "fix gpu related op registered" (#5647) · 7c3ec220
  Committed by dzhwinter on Nov 14, 2017
  7c3ec220
22 Sep, 2017 · 1 commit
- Elementwise operator. (#4139) · f99841dd
  Committed by gongweibao on Sep 22, 2017
```
Elementwise operator add/sub/mul/div
```
  f99841dd
13 Sep, 2017 · 1 commit
- Add element-wise multiplication operator. (#3787) · 8778957c
  Committed by gongweibao on Sep 13, 2017
```
Add element-wise multiplication operator
```
  8778957c
24 Aug, 2017 · 1 commit
- register rowwise add gpu kernel · 12864f14
  Committed by qiaolongfei on Aug 23, 2017
  12864f14
07 Aug, 2017 · 1 commit
- "remove a lot alias" · 610801b5
  Committed by dongzhihong on Aug 07, 2017
  610801b5
04 Aug, 2017 · 2 commits
- Add cpplint for *.h and cuda *.cu · b58725bd
  Committed by liaogang on Aug 04, 2017
  b58725bd
- fix op name · 8ff3590e
  Committed by dongzhihong on Aug 04, 2017
  8ff3590e
31 Jul, 2017 · 1 commit
- add EIGEN_USE_GPU macro to op.cu file · 61f94f00
  Committed by qijun on Jul 31, 2017
  61f94f00
25 Jul, 2017 · 1 commit
- Add type_alias to import framework into ops · efc119b4
  Committed by Yu Yang on Jul 25, 2017
```
Make implementing an operator less noisy.
```
  efc119b4
