Commits · fc6eed5b2789d5cdb5c84bf2fb9e41db2bcfdc5d · 机器未来 / Paddle

13 Jan, 2022 · 1 commit

Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b

Committed by jakpiase on Jan 13, 2022

* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix

fc6eed5b

09 Jul, 2019 · 1 commit
- Add mkldnn int8 mul-op kernel (#17834) · 0caa08ea
  Committed by Physher on Jul 09, 2019
  0caa08ea
14 May, 2019 · 1 commit

support fc_op double grad (#17317) · 60be66e2

Committed by Kaipeng Deng on May 14, 2019

* add double grad for mul_op. test=develop

* fix format. test=develop

* fix format. test=develop

* fix format. test=develop

* refine code. test=develop

* remove setzero. test=develop

* fix dx/dy init bug. test=develop

* fix format. test=develop

60be66e2

22 Aug, 2018 · 1 commit
- Process elemwise grad op's lod. mul_op's lod · 211d8186
  Committed by Yu Yang on Aug 22, 2018
  211d8186
04 May, 2018 · 1 commit
- Clean and extract blas · ef6ea790
  Committed by Yu Yang on May 04, 2018
  ef6ea790
03 May, 2018 · 1 commit
- Clean MatMul · 815d8884
  Committed by Yu Yang on May 03, 2018
  815d8884
15 Mar, 2018 · 1 commit

Add fp16 mul op support and bind paddle fp16 to numpy fp16 (#9017) · e26f1123

Committed by Kexin Zhao on Mar 14, 2018

* add fp16 mul op support

* small fix

* fix bug

* small fix

* fix PADDLE_WITH_CUDA compiling issue

* reorg code

* test for pybind

* treate as float16 as uint16_t in pybind

* bind np.float16 to paddle float16

* small fix

* clean code

* remove redundancy

* fix mul_op test

* address comments

* small fix

* add is_float16_supported func

e26f1123

12 Feb, 2018 · 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
  24509f4a
10 Feb, 2018 · 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
  fc374821
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
  90648f33
26 Dec, 2017 · 1 commit
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017
  761b3297
12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

There are mainly following fixes:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95
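
As a rough illustration of the first item in the commit message above (taking `DeviceContext` rather than `Place` as the template parameter of functors and kernels), here is a minimal C++ sketch. `CPUDeviceContext`, `ScaleFunctor`, and `RunScale` are hypothetical stand-ins written for this example; they are not Paddle's actual classes or signatures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a device context type (not Paddle's real class).
struct CPUDeviceContext {};

// A math functor templated on the device context type. Under the assumed
// design, the context object is passed in directly, so the functor does not
// need to look one up from a Place.
template <typename DeviceContext, typename T>
struct ScaleFunctor {
  void operator()(const DeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    (void)ctx;  // A real implementation would use ctx for streams, handles, etc.
    for (T& v : *data) {
      v *= factor;
    }
  }
};

// A kernel-like entry point that is also templated on DeviceContext and
// forwards the context to the functor. Hypothetical helper for illustration.
template <typename DeviceContext, typename T>
std::vector<T> RunScale(const DeviceContext& ctx, std::vector<T> data,
                        T factor) {
  ScaleFunctor<DeviceContext, T>()(ctx, factor, &data);
  return data;
}
```

Calling `RunScale<CPUDeviceContext, float>(ctx, {1.0f, 2.0f}, 3.0f)` scales each element by 3. A GPU variant would, under this sketch, specialize `ScaleFunctor` for a different context type.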

09 Nov, 2017 · 1 commit
- remove header file paddle/framework/eigen.h · cceed081
  Committed by dangqingqing on Nov 09, 2017
  cceed081
08 Nov, 2017 · 1 commit
- Remove fill_constant_batch_size_like_op.h and clean some operator codes. · e5791dd1
  Committed by dangqingqing on Nov 08, 2017
  e5791dd1
24 Oct, 2017 · 1 commit

Correct mul_op implementation (#4988) · bc151174

Committed by Yu Yang on Oct 23, 2017

* Correct mul_op implementation

* Restore the origin shape after mul

* Fix mul op

* Do not touch math_function

bc151174

20 Oct, 2017 · 1 commit

Remove template parameter for Tensor methods (#4937) · c532b967

Committed by Yu Yang on Oct 19, 2017

* Remove template parameter for Tensor methods

* Also check the type is correct when data()
* Simplize holder_

* Fix accuracy_op

* Register Code

c532b967

28 Sep, 2017 · 1 commit
- Add Skeleton of Double support · 3a5693e0
  Committed by Yu Yang on Sep 27, 2017
  3a5693e0
19 Sep, 2017 · 1 commit

Remove lazy-initialization in device_context · 81d56ca8

Committed by Yu Yang on Sep 18, 2017

* Also use `const DeviceContext&` all the time, to prevent `const_cast`

Fix #4169
Fix #3468
Fix #3475

81d56ca8

07 Sep, 2017 · 2 commits
- Follow comments · 5aacd64b
  Committed by fengjiayi on Sep 06, 2017
  5aacd64b
- Follow comments · f2a66ffa
  Committed by fengjiayi on Sep 06, 2017
  f2a66ffa
06 Sep, 2017 · 1 commit
- Add global function `FalttenToMatrix` and add `axis` for MulOp · af0264aa
  Committed by fengjiayi on Sep 05, 2017
  af0264aa
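
The commit title above mentions flattening a tensor to a matrix and adding an `axis` for MulOp. One common way to do this, assumed here rather than taken from the commit's code, is to multiply the dimensions before `axis` into the row count and the remaining dimensions into the column count. `FlattenShapeTo2D` is a hypothetical helper name used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: computes the 2-D (rows, cols) shape obtained by
// splitting `dims` at `axis`. Dimensions [0, axis) are multiplied into rows,
// dimensions [axis, rank) into cols. This is an assumed convention, not a
// confirmed description of the commit's implementation.
std::pair<std::int64_t, std::int64_t> FlattenShapeTo2D(
    const std::vector<std::int64_t>& dims, std::size_t axis) {
  std::int64_t rows = 1;
  std::int64_t cols = 1;
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (i < axis) {
      rows *= dims[i];
    } else {
      cols *= dims[i];
    }
  }
  return {rows, cols};
}
```

Under this convention a tensor of shape `[2, 3, 4]` flattened at `axis = 1` is treated as a 2 x 12 matrix, and at `axis = 2` as a 6 x 4 matrix, so the multiplication itself can stay a plain 2-D matrix product.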
05 Sep, 2017 · 1 commit
- revert scatter_op and other mirror changes. · ab55d793
  Committed by dangqingqing on Sep 05, 2017
  ab55d793
04 Sep, 2017 · 1 commit
- Make some operator correctly handle gradients for multi inputs. · 44703329
  Committed by dangqingqing on Sep 04, 2017
  44703329
19 Aug, 2017 · 1 commit
- "tensor mutable data" · 0cf5bdec
  Committed by dongzhihong on Aug 18, 2017
  0cf5bdec
18 Aug, 2017 · 1 commit
- "format style" · 7b4b9d3e
  Committed by dongzhihong on Aug 17, 2017
  7b4b9d3e
14 Aug, 2017 · 2 commits
- "remove unused commented code" · e0395a53
  Committed by dongzhihong on Aug 14, 2017
  e0395a53
- "refine argument with new style " · 632b320e
  Committed by dongzhihong on Aug 14, 2017
  632b320e
11 Aug, 2017 · 1 commit
- Fix python unit tests · c99f84ac
  Committed by Yu Yang on Aug 11, 2017
  c99f84ac
10 Aug, 2017 · 1 commit
- "on hold" · 2ddb1122
  Committed by dongzhihong on Aug 10, 2017
  2ddb1122
09 Aug, 2017 · 1 commit
- fix gpu build error · 7307b439
  Committed by qijun on Aug 09, 2017
  7307b439
08 Aug, 2017 · 1 commit
- "fix clang format" · 22f03c39
  Committed by dongzhihong on Aug 08, 2017
  22f03c39
07 Aug, 2017 · 2 commits
- "remove alias to more operators" · 6b23b91c
  Committed by dongzhihong on Aug 07, 2017
  6b23b91c
- add global matmul function for Tensor · 97d8175a
  Committed by qijun on Aug 07, 2017
  97d8175a
05 Aug, 2017 · 1 commit
- Reformat paddle/operators/* strictly following Google Style Guide · 9620df44
  Committed by Yi Wang on Aug 04, 2017
  9620df44
03 Aug, 2017 · 2 commits
- fix gpu build error · f190a795
  Committed by qijun on Aug 03, 2017
  f190a795
- add gemm for both cpu and gpu · 22dac40c
  Committed by qijun on Aug 03, 2017
  22dac40c
02 Aug, 2017 · 2 commits
- Refine compute code in operators · b36205e2
  Committed by liaogang on Aug 02, 2017
  b36205e2
- Return Reference Instead Pointer to GetEigenDevice · 02655a22
  Committed by Yu Yang on Aug 02, 2017
  02655a22
01 Aug, 2017 · 1 commit

use operator context and infer context (#3024) · 61ebacbc

Committed by Qiao Longfei on Aug 01, 2017

* use operator context

* optimize code

* update net infershape

* update InferShape

* disable override InferShape(scope) in OperatorBase

* change InferShapeImpl to InferShape

* add template to OperatorContext Input/Output

* merge Input InputVar, Output OutputVar

* change Inputs to MultiInput

* fix conflict

* fix MultiInput bugs and add unit test

* rename KernelContext to ExecutionContext

* clean code

* change InferShape to protected

* fix template bug

* refine code

* use InputVar instead of Input<Variable>

* typo

* optimize code

61ebacbc

25 Jul, 2017 · 1 commit
- Add type_alias to import framework into ops · efc119b4
  Committed by Yu Yang on Jul 25, 2017
```
Make implement an operator less noisy.
```
  efc119b4
