提交 · be817719982f1821ab0519ceab85ec238bf99d43 · Crayon鑫 / Paddle

11 1月, 2022 1 次提交

【PTen】Add dot and matmul grad kernel in pten (#38713) · be817719

由 zyfncg 提交于 1月 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

be817719

29 10月, 2021 1 次提交

add new API/OP: paddle.linalg.triangular_solve (#36714) · 92d6a048

由 zhouweiwei2014 提交于 10月 29, 2021

* add new API: paddle.linalg.triangular_solve

* add new API/OP: paddle.linalg.triangular_solve

* add new API/OP: paddle.linalg.triangular_solve

* fix comment

92d6a048

24 9月, 2021 1 次提交

Add paddle.linalg.solve OP (#35715) · 8caf951c

由 Weilong Wu 提交于 9月 24, 2021

* Add linalg.solve op, test=develop

* Fix a bug caused by accidental deletion

* updated description and fix a bug: missing a comma

* Add linalg.solve op, test=develop

* updated solve op backward logic

* updated solve op backward logic again

* Add linalg.solve Op, test=develop

* Updated and modified to fit CI requirements

* Fix a bug

* 1)Add more test cases; 2)Fix a wrong usage in reduces operation; 3)Remove redundant code

* Remove redundant comments

* 1)Removed redundant code; 2)Updated to enhance code robustness

* Removed redundant code

* Updated API documents

8caf951c

03 3月, 2021 1 次提交
- Q
  [ROCM] update fluid operators for rocm (part3), test=develop (#31213) · 84639b61
  由 Qi Li 提交于 3月 03, 2021
```
* [ROCM] update fluid operators for rocm (part3), test=develop

* fix clang format error, test=develop
```
  84639b61
03 11月, 2020 1 次提交
- W
  
  Paddle support compile on sw (#27858) · 09fd2b2a
  由 Wilber 提交于 11月 03, 2020
  
  09fd2b2a
24 9月, 2020 1 次提交

use iwyu clean include (#27267) · df43905f

由 wanghuancoder 提交于 9月 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

22 8月, 2020 1 次提交
- S
  Add Matmul op (#26411) · c6090660
  由 ShenLiang 提交于 8月 22, 2020
```
* add matmul_v2
```
  c6090660
27 4月, 2020 1 次提交
- Y
  
  Add the implementation of inverse (#23310) · ecfddebb
  由 Yiqun Liu 提交于 4月 27, 2020
  
  ecfddebb
24 4月, 2020 1 次提交

Add cholesky_op (#23543) · a8c0fb4e

由 Guo Sheng 提交于 4月 24, 2020

* Add cholesky_op forward part. test=develop

* Complete cholesky_op forward part. test=develop

* Add cholesky_op backward part. test=develop

* Complete cholesky_op backward part. test=develop

* Refine cholesky_op error check and docs. test=develop

* Add grad_check unit test for cholesky_op. test=develop

* Fix sample code in cholesky doc. test=develop

* Refine some error messages of cholesky_op. test=develop

* Refine some error messages of cholesky_op. test=develop

* Remove unused input in cholesky_grad. test=develop

* Remove unused input in cholesky_grad. test=develop

* Fix stream for cusolverDnSetStream. test=develop

* Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code.
test=develop

* Add CUSOLVER ERROR in enforce.h
test=develop

* Fix the missing return value in cholesky. test=develop

a8c0fb4e

30 9月, 2019 1 次提交
- D
  Improve elementwise operators performance in same dimensions. (#19763) · 425279a5
  由 danleifeng 提交于 9月 30, 2019
```
Improve elementwise operators performance in same dimensions
```
  425279a5
25 9月, 2019 1 次提交

add support of matmul with multiple head even different width and height (#19708) · c670058a

由 Bob Zhu 提交于 9月 25, 2019

* add support of matmul with multiple head even different width and height

Original matmul with multiple head supports only the mat_a.width == mat_b.height,
in that case, mat_b will be horizontally split. In this patch, we extend the
support when mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height,
in this case, mab_b will be vertically split.

One example is A is [3, 8], B is [2, 16], head_number is 4. In this
case, A will be split as [3, 2], B will be (vertically) split as
[2, 4]. The final result will be 4 matrix of 4 matrix of [3,4], i.e. [3, 16]

test=develop

* add support of matmul with multiple head even different width and height

Original matmul with multiple head supports only the mat_a.width == mat_b.height,
in that case, mat_b will be horizontally split. In this patch, we extend the
support when mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height,
in this case, mab_b will be vertically split.

One example is A is [3, 8], B is [2, 16], head_number is 4. In this
case, A will be split as [3, 2], B will be (vertically) split as
[2, 4]. The final result will be 4 matrix of 4 matrix of [3,4], i.e. [3, 16]

test=develop

* refactor the code of matmul with multiple head even different width and height

test=develop

c670058a

20 8月, 2019 1 次提交

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

由 Yihua Xu 提交于 8月 20, 2019

* Implement the operator with sprase matrix multiply

* Update the URL of mklml library.

test=develop

* Disable MKLML implematation when using no-linux.

test=develop

* Ignore the deprecated status for windows

test=develop

b9203958

24 7月, 2019 1 次提交

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

由 Bob Zhu 提交于 7月 24, 2019

* extend matmul op to support multiple head multiplication

With the support of multiple head, the multiplication of two big matrixes is
split into multiplication of several (head_number) small matrixes. e.g. if
Mat A is [3, 24] and Mat B is [24, 4], when multiple A and B with head_number
as 4, Mat A will be split as 4 matrix of [3, 6] and Mat B will be 4 matrix of
[6, 4]. The result of final matrix will be 4 matrix of [3, 4], i.e. [3, 16].

220eef60

04 3月, 2019 1 次提交
- Y
  Optimize gelu operation with mkl erf. · b48d56e8
  由 Yihua Xu 提交于 2月 26, 2019
```
test=develop
```
  b48d56e8
26 2月, 2019 1 次提交
- Y
  Optimize gelu operation with mkl erf. · 73967886
  由 Yihua Xu 提交于 2月 26, 2019
```
test=develop
```
  73967886
22 2月, 2019 2 次提交

T
Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
由 tensor-tang 提交于 2月 22, 2019
```
* Revert "Optimze Gelu with MKL Erf function (#15770)"

This reverts commit 676995c8.

* test=develop
```
ee2321de

Optimze Gelu with MKL Erf function (#15770) · 676995c8

由 Yihua Xu 提交于 2月 22, 2019

* Optimize for gelu operator

* Set up the low accuracy mode of MKL ERF function.

test=develop

* Only enable MKLML ERF when OS is linux

* Use the speical mklml version included vmsErf function to verify gelu mkl kernel.

test=develop

* Add the CUDA macro to avoid NVCC's compile issue.

test=develop

* Add the TODO comments for mklml library modification.

test=develop

* Clean Code

test=develop

* Add the comment of marco for NVCC compiler.

test=develop

676995c8

13 12月, 2018 1 次提交
- Y
  
  Use mkl · 7b10bf0e
  由 Yu Yang 提交于 12月 13, 2018
  
  7b10bf0e
27 11月, 2018 1 次提交
- J
  
  - ASUM MKL integration · 8bfa1fa9
  由 Jacek Czaja 提交于 11月 27, 2018
  
  8bfa1fa9
16 11月, 2018 1 次提交
- T
  fix lrn on mac (#14426) · 64f7516a
  由 tensor-tang 提交于 11月 16, 2018
```
* rename and fix blas vsqr

test=develop

* update
```
  64f7516a
13 11月, 2018 1 次提交
- T
  
  add mkl vsqr and vpow · 1be85d01
  由 tensor-tang 提交于 11月 13, 2018
  
  1be85d01
22 8月, 2018 5 次提交
- T
  
  fix bugs · cf5ea925
  由 tensor-tang 提交于 8月 22, 2018
  
  cf5ea925
- T
  
  add blas vexp · 3dd66390
  由 tensor-tang 提交于 8月 22, 2018
  
  3dd66390
- T
  
  fix blas dot and add cblas scal · 0ec1f65c
  由 tensor-tang 提交于 8月 22, 2018
  
  0ec1f65c
- T
  
  add cblas dot · a2203d04
  由 tensor-tang 提交于 8月 22, 2018
  
  a2203d04
- T
  
  refine blas gemm · f72ab896
  由 tensor-tang 提交于 8月 22, 2018
  
  f72ab896
16 8月, 2018 1 次提交
- T
  
  add mklml vmul · 6644ce79
  由 tensor-tang 提交于 8月 16, 2018
  
  6644ce79
06 8月, 2018 1 次提交
- T
  
  fix blas · 54c95e49
  由 tensor-tang 提交于 8月 06, 2018
  
  54c95e49
03 8月, 2018 2 次提交
- T
  
  fix blas and use packed weight · 8c23f7c4
  由 tensor-tang 提交于 8月 03, 2018
  
  8c23f7c4
- T
  
  add mkl packed gemm · 43cee33a
  由 tensor-tang 提交于 8月 02, 2018
  
  43cee33a
05 7月, 2018 2 次提交
- T
  
  link libxsmm · 17987eb3
  由 tensor-tang 提交于 7月 05, 2018
  
  17987eb3
- D
  
  "remove lapack" (#11966) · 99a99ec7
  由 dzhwinter 提交于 7月 05, 2018
  
  99a99ec7
27 6月, 2018 1 次提交
- T
  
  move SetNumThreads to platform · e3a96300
  由 tensor-tang 提交于 6月 27, 2018
  
  e3a96300
20 6月, 2018 1 次提交
- T
  
  enable dynamic load mklml lib on fluid · f503f129
  由 tensor-tang 提交于 6月 20, 2018
  
  f503f129
24 5月, 2018 1 次提交
- T
  
  MKL elementwise add: elementwise_add uses vAdd VML function when MKL is used · e43c8f33
  由 Tomasz Patejko 提交于 5月 17, 2018
  
  e43c8f33
21 5月, 2018 1 次提交
- L
  Add an interface to set the number of threads for math function, and set the... · 39eb871d
  由 Liu Yiqun 提交于 5月 21, 2018
```
Add an interface to set the number of threads for math function, and set the default value to 1 for inference.
```
  39eb871d
08 5月, 2018 2 次提交
- Y
  
  Follow comments and polish code names · fcd31d61
  由 Yu Yang 提交于 5月 08, 2018
  
  fcd31d61
- Y
  Move MatMul to blas_impl.h · 0a13d3c6
  由 Yu Yang 提交于 5月 08, 2018
```
Rename MatDim to MatDescriptor
```
  0a13d3c6
07 5月, 2018 1 次提交
- Y
  
  Rewrite Matmul, make code cleaner · c6a6d87f
  由 Yu Yang 提交于 5月 07, 2018
  
  c6a6d87f
04 5月, 2018 1 次提交
- Y
  
  Clean and extract blas · ef6ea790
  由 Yu Yang 提交于 5月 04, 2018
  
  ef6ea790

Crayon鑫 / Paddle 与 Fork 源项目一致

Crayon鑫 / Paddle
与 Fork 源项目一致