Commits · be817719982f1821ab0519ceab85ec238bf99d43 · Crayon鑫 / Paddle

11 Jan, 2022 · 1 commit

【PTen】Add dot and matmul grad kernel in pten (#38713) · be817719

Committed by zyfncg on Jan 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

be817719

29 Oct, 2021 · 1 commit

add new API/OP: paddle.linalg.triangular_solve (#36714) · 92d6a048

Committed by zhouweiwei2014 on Oct 29, 2021

* add new API: paddle.linalg.triangular_solve

* add new API/OP: paddle.linalg.triangular_solve

* add new API/OP: paddle.linalg.triangular_solve

* fix comment

92d6a048

12 Aug, 2021 · 1 commit

Fix safety-bug of functional.linear (#34696) · 0e28c8bb

Committed by zhulei on Aug 12, 2021

* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear

* Fix safety-bug of functional.linear

0e28c8bb

22 Jul, 2021 · 1 commit

Add int16 kernel for lookup_talbe and dequantize_abs_max op (#34275) · 85e531a9

Committed by cc on Jul 22, 2021

* add int16 kernel for lookup_talbe and dequantize_abs_max op

85e531a9

26 May, 2021 · 1 commit

modify matmul Op to complex template types (#33130) · 6c07cd7e

Committed by chentianyu03 on May 26, 2021

* modify matmul Op to complex template types

* remove complex64/128 head file

6c07cd7e

06 May, 2021 · 1 commit

Sum kernel for CPU supporting BF16 and SelectedRows (#32631) · 9599c3b3

Committed by Adam Osewski on May 06, 2021

9599c3b3

19 Mar, 2021 · 1 commit

[oneDNN] lookup_table op with support for BF16 data type. (#31558) · a4a2b77d

Committed by Adam Osewski on Mar 19, 2021

a4a2b77d

03 Mar, 2021 · 1 commit

[ROCM] update fluid operators for rocm (part3), test=develop (#31213) · 84639b61

Committed by Qi Li on Mar 03, 2021

* [ROCM] update fluid operators for rocm (part3), test=develop

* fix clang format error, test=develop

84639b61

25 Dec, 2020 · 1 commit

[Complex] Add support for complex grad accumulated (#29889) · 1a304e6c

Committed by Chen Weihang on Dec 25, 2020

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

1a304e6c

01 Dec, 2020 · 1 commit

add complex64 and complex128 type; add +-*/@ and slice opreator for c… (#29199) · 8f45d142

Committed by chentianyu03 on Dec 01, 2020

* add complex64 and complex128 type; add +-*/@ and slice opreator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

8f45d142

24 Sep, 2020 · 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

18 Sep, 2020 · 1 commit

fix the error message for the math dir · b6a4349d

Committed by wawltor on Sep 18, 2020

https://github.com/PaddlePaddle/Paddle/pull/27332

b6a4349d

22 Aug, 2020 · 1 commit

Add Matmul op (#26411) · c6090660

Committed by ShenLiang on Aug 22, 2020

* add matmul_v2

c6090660

24 Apr, 2020 · 1 commit

Add cholesky_op (#23543) · a8c0fb4e

Committed by Guo Sheng on Apr 24, 2020

* Add cholesky_op forward part. test=develop

* Complete cholesky_op forward part. test=develop

* Add cholesky_op backward part. test=develop

* Complete cholesky_op backward part. test=develop

* Refine cholesky_op error check and docs. test=develop

* Add grad_check unit test for cholesky_op. test=develop

* Fix sample code in cholesky doc. test=develop

* Refine some error messages of cholesky_op. test=develop

* Refine some error messages of cholesky_op. test=develop

* Remove unused input in cholesky_grad. test=develop

* Remove unused input in cholesky_grad. test=develop

* Fix stream for cusolverDnSetStream. test=develop

* Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code.
test=develop

* Add CUSOLVER ERROR in enforce.h
test=develop

* Fix the missing return value in cholesky. test=develop

a8c0fb4e

28 Feb, 2020 · 1 commit

fix typo word (#22784) · 433cef03

Committed by tianshuo78520a on Feb 28, 2020

433cef03

22 Nov, 2019 · 1 commit

add dequantize_abs_max op and modify lookup_table op (#20899) · f0b15184

Committed by Liufang Sang on Nov 22, 2019

* add int8 kernel to lookup_table op and add dequantize op test=develop

* change paddle_enforce to paddle_enforce_eq test=develop

* change copyright and change some not suitable code test=develop

* remove debug log test=develop

* replace GetInputType with IndicateVarDataType test=develop

* fix EmptyGradMaker test=develop

* fix diff between cpu and gpu test=develop

* use memcopy when int8_t test=develop

f0b15184

30 Sep, 2019 · 1 commit

Improve elementwise operators performance in same dimensions. (#19763) · 425279a5

Committed by danleifeng on Sep 30, 2019

Improve elementwise operators performance in same dimensions

425279a5

25 Sep, 2019 · 1 commit

add support of matmul with multiple head even different width and height (#19708) · c670058a

Committed by Bob Zhu on Sep 25, 2019

* add support of matmul with multiple head even different width and height

The original multi-head matmul supports only the case mat_a.width == mat_b.height,
in which mat_b is split horizontally. This patch extends support to the case where
mat_a.width != mat_b.height but mat_a.width / head_number == mat_b.height;
in this case, mat_b is split vertically.

For example, let A be [3, 8], B be [2, 16], and head_number be 4. Then A is
split into 4 matrices of [3, 2] and B is (vertically) split into 4 matrices of
[2, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

test=develop
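
The vertical-split case described in this commit message can be illustrated with a minimal pure-Python sketch. This is not Paddle's kernel: the helper names (`matmul`, `split_columns`, `multihead_matmul_vertical`) are hypothetical, matrices are plain lists of rows, and the sketch assumes that both A and B are cut into equal-width column blocks, which matches the [3, 8] x [2, 16] example.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [[sum(row[k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for row in a]


def split_columns(m, parts):
    """Split matrix m into `parts` column blocks of equal width."""
    width = len(m[0])
    assert width % parts == 0, "width must be divisible by the number of parts"
    step = width // parts
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(parts)]


def multihead_matmul_vertical(a, b, head_number):
    """Multiply per-head column blocks of a and b, then concatenate the results."""
    # Precondition from the commit message: a.width / head_number == b.height.
    assert len(a[0]) // head_number == len(b)
    a_blocks = split_columns(a, head_number)   # e.g. 4 blocks of [3, 2]
    b_blocks = split_columns(b, head_number)   # e.g. 4 blocks of [2, 4]
    outs = [matmul(x, y) for x, y in zip(a_blocks, b_blocks)]
    # Concatenate the per-head results side by side, row by row.
    return [[v for o in outs for v in o[r]] for r in range(len(a))]


a = [[r * 8 + c for c in range(8)] for r in range(3)]     # A is [3, 8]
b = [[r * 16 + c for c in range(16)] for r in range(2)]   # B is [2, 16]
out = multihead_matmul_vertical(a, b, head_number=4)
print(len(out), len(out[0]))  # 3 16
```

Each head multiplies a [3, 2] block by a [2, 4] block, so the four [3, 4] results concatenate to [3, 16], as in the example above.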

* refactor the code of matmul with multiple head even different width and height

test=develop

c670058a

04 Sep, 2019 · 1 commit

refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607) · 0a46d345

Committed by Tao Luo on Sep 04, 2019

test=develop

0a46d345

02 Sep, 2019 · 1 commit

fix the compilation issue on windows caused by mkl_CSRMM (#19533) · 84c72801

Committed by zhouwei25 on Sep 02, 2019

84c72801

20 Aug, 2019 · 1 commit

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

Committed by Yihua Xu on Aug 20, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable MKLML implementation when not on Linux.

test=develop

* Ignore the deprecated status for windows

test=develop

b9203958

24 Jul, 2019 · 1 commit

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multi-head support, the multiplication of two large matrices is
split into the multiplication of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], then multiplying A and B with head_number
set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
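
The split described in this commit message can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the Paddle implementation: `matmul` and `multihead_matmul` are hypothetical helpers, matrices are lists of rows, A is cut into column blocks and B into row blocks, and the per-head products are concatenated side by side.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [[sum(row[k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for row in a]


def multihead_matmul(a, b, head_number):
    """Split the shared dimension into heads and multiply block by block."""
    k = len(b)
    assert len(a[0]) == k, "a.width must equal b.height"
    assert k % head_number == 0, "shared dimension must divide evenly"
    step = k // head_number
    outs = []
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # column block of A
        b_h = b[h * step:(h + 1) * step]                    # row block of B
        outs.append(matmul(a_h, b_h))
    # Concatenate the per-head results side by side, row by row.
    return [[v for o in outs for v in o[r]] for r in range(len(a))]


a = [[1] * 24 for _ in range(3)]   # Mat A is [3, 24]
b = [[1] * 4 for _ in range(24)]   # Mat B is [24, 4]
out = multihead_matmul(a, b, head_number=4)
print(len(out), len(out[0]))  # 3 16
```

Summing the four [3, 4] blocks of the result column-wise recovers the ordinary product of A and B, which is a convenient sanity check for a sketch like this.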

220eef60

28 Jun, 2019 · 1 commit

Add a unittest to inplace elementwise_add (#18385) · f5641000

Committed by Zeng Jinle on Jun 28, 2019

* add_elementwise_add_inplace_test,test=develop

* rename file, test=develop

f5641000

04 Mar, 2019 · 1 commit

Optimize gelu operation with mkl erf. · b48d56e8

Committed by Yihua Xu on Feb 26, 2019

test=develop

b48d56e8

26 Feb, 2019 · 1 commit

Optimize gelu operation with mkl erf. · 73967886

Committed by Yihua Xu on Feb 26, 2019

test=develop

73967886

22 Feb, 2019 · 2 commits

Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de

Committed by tensor-tang on Feb 22, 2019

* Revert "Optimze Gelu with MKL Erf function (#15770)"

This reverts commit 676995c8.

* test=develop

ee2321de

Optimze Gelu with MKL Erf function (#15770) · 676995c8

Committed by Yihua Xu on Feb 22, 2019

* Optimize for gelu operator

* Set up the low accuracy mode of MKL ERF function.

test=develop

* Only enable MKLML ERF when OS is linux

* Use the special mklml version that includes the vmsErf function to verify gelu mkl kernel.

test=develop

* Add the CUDA macro to avoid NVCC's compile issue.

test=develop

* Add the TODO comments for mklml library modification.

test=develop

* Clean Code

test=develop

* Add a comment on the macro for the NVCC compiler.

test=develop
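
For context, the exact GELU is defined through the error function, which is why a fast erf routine matters for this operator. The sketch below is a hedged illustration only: it uses Python's `math.erf` as a stand-in for the MKL `vmsErf` routine named in the commit message, and the function names `gelu` and `gelu_batch` are hypothetical. The two-pass structure (erf over the whole buffer, then an elementwise combine) is an assumption about how a vectorized erf would typically be used, not a description of the Paddle kernel.

```python
import math


def gelu(x):
    """Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_batch(xs):
    """Two-pass form: evaluate erf over the whole buffer, then combine."""
    inv_sqrt2 = 1.0 / math.sqrt(2.0)
    # Pass 1: one erf call per element; a vectorized erf would replace this loop.
    erfs = [math.erf(x * inv_sqrt2) for x in xs]
    # Pass 2: elementwise combine of the inputs with their erf values.
    return [0.5 * x * (1.0 + e) for x, e in zip(xs, erfs)]


print(gelu_batch([-1.0, 0.0, 1.0]))
```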

676995c8

13 Dec, 2018 · 1 commit

Use mkl · 7b10bf0e

Committed by Yu Yang on Dec 13, 2018

7b10bf0e

28 Nov, 2018 · 1 commit

- Coding style fixes · 48e1b97e

Committed by Jacek Czaja on Nov 28, 2018

test=develop

48e1b97e

27 Nov, 2018 · 2 commits

- Building fix to softmax for inference · cf40daee

Committed by Jacek Czaja on Nov 27, 2018

cf40daee

- ASUM MKL integration · 8bfa1fa9

Committed by Jacek Czaja on Nov 27, 2018

8bfa1fa9

16 Nov, 2018 · 1 commit

fix lrn on mac (#14426) · 64f7516a

Committed by tensor-tang on Nov 16, 2018

* rename and fix blas vsqr

test=develop

* update

64f7516a

13 Nov, 2018 · 1 commit

add mkl vsqr and vpow · 1be85d01

Committed by tensor-tang on Nov 13, 2018

1be85d01

22 Aug, 2018 · 5 commits

fix bugs · cf5ea925

Committed by tensor-tang on Aug 22, 2018

cf5ea925

add blas vexp · 3dd66390

Committed by tensor-tang on Aug 22, 2018

3dd66390

fix blas dot and add cblas scal · 0ec1f65c

Committed by tensor-tang on Aug 22, 2018

0ec1f65c

add cblas dot · a2203d04

Committed by tensor-tang on Aug 22, 2018

a2203d04

refine blas gemm · f72ab896

Committed by tensor-tang on Aug 22, 2018

f72ab896

16 Aug, 2018 · 1 commit

add mklml vmul · 6644ce79

Committed by tensor-tang on Aug 16, 2018

6644ce79

06 Aug, 2018 · 1 commit

fix blas · 54c95e49

Committed by tensor-tang on Aug 06, 2018

54c95e49

Crayon鑫 / Paddle is in sync with the fork's source project
