Commits · a4bccde08c6a9635f483dc805b8c940be8835446 · Crayon鑫 / Paddle

01 Mar, 2022 1 commit

[bf16] add bf16 kernel: layer_norm p_norm reduce_sum (#39843) · ce8ed978

Committed by zhangbo9674 on Mar 01, 2022

* add layer norm

* add p norm

* add reduce sum

* refine layer norm register bf16 for cudnn811

* add bf16 cast for hip

* add unittest

* refine rocm

* refine layer_norm unittest

* refine reduce op

* refine unittest

* enhance atol for reduce unittest

09 Feb, 2022 1 commit

Replace EigenBroadcast with ElementwiseBroadcast in ReduceGrad (#39255) · 772be4f5

Committed by niuliling123 on Feb 09, 2022

25 Jan, 2022 2 commits

Revert "Replace EigenBroadcast with ElementwiseBroadcast in ReduceGrad (#38959)" (#39205) · 978558be

Committed by niuliling123 on Jan 25, 2022

```
This reverts commit 9059ef69.
```

Replace EigenBroadcast with ElementwiseBroadcast in ReduceGrad (#38959) · 9059ef69

Committed by niuliling123 on Jan 25, 2022

17 Dec, 2021 1 commit

Delete cub_reduce.h and modified the TensorReduce to TensorReduceFunctorImpl (#38197) · 9a8a4c77

Committed by niuliling123 on Dec 17, 2021

15 Jun, 2021 1 commit

Support reduce_sum_op float16 (#32966) · 606939de

Committed by jiangcheng on Jun 15, 2021

* add reduce_sum_op by add self-kernel

* set all ReduceKernel MPType for accuracy

* add float16 test script which input is integer number

* solve reduce sum float16 check_grad problem

* solve conflict and change test script for CI

* change kernel register for CI

* remove all useless template

28 May, 2021 1 commit

modify to complex template types for fill_constant op (#33179) · 1187c610

Committed by chentianyu03 on May 28, 2021

* modify to complex template types for fill_constant op

* modify to complex template types for py_layer, strided_slice and reduce_sum_op.part

18 May, 2021 1 commit

add unit8 for concat (#32850) · 53580bb4

Committed by liuyuhui on May 18, 2021

25 Dec, 2020 1 commit

[Complex] Add support for complex grad accumulated (#29889) · 1a304e6c

Committed by Chen Weihang on Dec 25, 2020

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

05 Sep, 2019 1 commit

update reduce_sum and reduce_mean to save memory, test=develop (#19608) · af692c91

Committed by Leo Chen on Sep 05, 2019

16 Nov, 2018 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

12 Nov, 2018 1 commit

perf(compile): speed up reduce_op compile by splitting files (#14294) · 8f9bfad2

Committed by Yu Yang on Nov 12, 2018

```
test=develop
```

06 Jun, 2018 3 commits

Refine code · 41ced8e2

Committed by fengjiayi on Jun 06, 2018

Add unit tests · aa9383f3

Committed by fengjiayi on Jun 06, 2018

complete C++ part · e2bb4d07

Committed by fengjiayi on Jun 06, 2018
