Commits · a6ce2306f92ca575406adb3f23c5bf661a9a06a3 · Crayon鑫 / Paddle

30 Sep, 2019 1 commit
- D
  Improve elementwise operators performance in same dimensions. (#19763) · 425279a5
  Authored by danleifeng on Sep 30, 2019
```
Improve elementwise operators performance in same dimensions
```
  425279a5
25 Sep, 2019 1 commit

add support of matmul with multiple head even different width and height (#19708) · c670058a

Authored by Bob Zhu on Sep 25, 2019

* add support of matmul with multiple head even different width and height

The original matmul with multiple heads supports only the case mat_a.width == mat_b.height,
in which mat_b is split horizontally. This patch extends the support to the case where
mat_a.width != mat_b.height but mat_a.width / head_number == mat_b.height;
in that case, mat_b is split vertically.

For example, if A is [3, 8], B is [2, 16], and head_number is 4, then
A is split into pieces of [3, 2] and B is (vertically) split into pieces of
[2, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

test=develop

* refactor the code of matmul with multiple head even different width and height

test=develop

c670058a
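
The vertical-split case described in this commit message can be sketched in a few lines of plain Python. This is only an illustration of the shapes in the example (A is [3, 8], B is [2, 16], head_number is 4), not Paddle's implementation; the helper names `dense_matmul`, `col_block`, and `multihead_matmul_vsplit` are hypothetical.

```python
def dense_matmul(a, b):
    """Plain dense product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def col_block(m, start, width):
    """Return the column block m[:, start:start + width]."""
    return [row[start:start + width] for row in m]


def multihead_matmul_vsplit(a, b, head_number):
    """Split A and B by columns into head_number pieces and multiply pairwise.

    Assumes a.width / head_number == b.height, as in the commit message.
    The per-head results are concatenated along the column axis.
    """
    a_w = len(a[0]) // head_number  # width of each A piece
    b_w = len(b[0]) // head_number  # width of each B piece
    if a_w != len(b):
        raise ValueError("a.width / head_number must equal b.height")
    out = [[] for _ in a]
    for h in range(head_number):
        part = dense_matmul(col_block(a, h * a_w, a_w), col_block(b, h * b_w, b_w))
        for out_row, part_row in zip(out, part):
            out_row.extend(part_row)
    return out


vs_a = [[1] * 8 for _ in range(3)]   # A is [3, 8]
vs_b = [[1] * 16 for _ in range(2)]  # B is [2, 16]
vs_c = multihead_matmul_vsplit(vs_a, vs_b, 4)
print(len(vs_c), len(vs_c[0]))  # 3 16
```

Each head multiplies a [3, 2] piece by a [2, 4] piece, so the four [3, 4] results sit side by side in the [3, 16] output.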

04 Sep, 2019 1 commit
- T
  refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607) · 0a46d345
  Authored by Tao Luo on Sep 04, 2019
```
test=develop
```
  0a46d345
02 Sep, 2019 1 commit
- Z
  
  fix the compilation issue on windows caused by mkl_CSRMM (#19533) · 84c72801
  Authored by zhouwei25 on Sep 02, 2019
  
  84c72801
20 Aug, 2019 1 commit

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

Authored by Yihua Xu on Aug 20, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable MKLML implementation when not on Linux.

test=develop

* Ignore the deprecated status for windows

test=develop

b9203958
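
As a rough illustration of why a sparse matrix product can implement embedding lookup followed by sum pooling: each sequence becomes one sparse row with a 1.0 at every token id, and multiplying that matrix by the dense embedding table yields the pooled vectors. The sketch below is plain Python with hypothetical helper names (`build_csr`, `csr_matmul`); it assumes sum pooling and a CSR-like layout in which repeated ids are simply stored as repeated entries, and it is not the operator's actual code.

```python
def build_csr(sequences):
    """Encode each sequence of token ids as one sparse row of ones (CSR-like)."""
    values, col_idx, row_ptr = [], [], [0]
    for seq in sequences:
        for token_id in seq:
            values.append(1.0)       # each occurrence contributes weight 1
            col_idx.append(token_id)
        row_ptr.append(len(col_idx))
    return values, col_idx, row_ptr


def csr_matmul(values, col_idx, row_ptr, dense):
    """Multiply the sparse matrix by a dense table (rows = embedding vectors)."""
    dim = len(dense[0])
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0.0] * dim
        for p in range(row_ptr[r], row_ptr[r + 1]):
            emb = dense[col_idx[p]]
            for d in range(dim):
                acc[d] += values[p] * emb[d]
        out.append(acc)
    return out


emb_table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 ids, dim 2
emb_seqs = [[0, 2], [1, 1, 3]]                                 # two sequences
emb_pooled = csr_matmul(*build_csr(emb_seqs), emb_table)
print(emb_pooled)  # [[6.0, 8.0], [13.0, 16.0]]
```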

24 Jul, 2019 1 commit

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Authored by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into the multiplication of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number
as 4, Mat A will be split into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result will be 4 matrices of [3, 4], i.e. [3, 16].

220eef60
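
The split described in this commit message can be checked with a small plain-Python sketch. It assumes the per-head results are concatenated along the column axis, which is what the [3, 16] output shape suggests; the function name `multihead_matmul_hsplit` is hypothetical and this is not Paddle's implementation.

```python
def multihead_matmul_hsplit(a, b, head_number):
    """Split A by columns and B by rows into head_number pieces, multiply pairwise.

    Requires a.width == b.height and that this size divides evenly by head_number.
    The per-head [rows, b.width] results are concatenated along the column axis.
    """
    k = len(a[0])
    if k != len(b) or k % head_number != 0:
        raise ValueError("a.width must equal b.height and be divisible by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        b_part = b[h * step:(h + 1) * step]            # row block of B
        for out_row, a_row in zip(out, a):
            a_part = a_row[h * step:(h + 1) * step]    # column block of A
            out_row.extend(
                sum(x * y for x, y in zip(a_part, col)) for col in zip(*b_part)
            )
    return out


hs_a = [[1] * 24 for _ in range(3)]  # Mat A is [3, 24]
hs_b = [[1] * 4 for _ in range(24)]  # Mat B is [24, 4]
hs_c = multihead_matmul_hsplit(hs_a, hs_b, 4)
print(len(hs_c), len(hs_c[0]))  # 3 16
```

Summing the four [3, 4] blocks elementwise recovers the ordinary [3, 4] product of A and B, which is a convenient sanity check for such a split.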

28 Jun, 2019 1 commit
- Z
  Add a unittest to inplace elementwise_add (#18385) · f5641000
  Authored by Zeng Jinle on Jun 28, 2019
```
* add_elementwise_add_inplace_test,test=develop

* rename file, test=develop
```
  f5641000
04 Mar, 2019 1 commit
- Y
  Optimize gelu operation with mkl erf. · b48d56e8
  Authored by Yihua Xu on Feb 26, 2019
```
test=develop
```
  b48d56e8
26 Feb, 2019 1 commit
- Y
  Optimize gelu operation with mkl erf. · 73967886
  Authored by Yihua Xu on Feb 26, 2019
```
test=develop
```
  73967886
22 Feb, 2019 2 commits

T
Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
Authored by tensor-tang on Feb 22, 2019
```
* Revert "Optimize Gelu with MKL Erf function (#15770)"

This reverts commit 676995c8.

* test=develop
```
ee2321de

Optimize Gelu with MKL Erf function (#15770) · 676995c8

Authored by Yihua Xu on Feb 22, 2019

* Optimize for gelu operator

* Set up the low accuracy mode of MKL ERF function.

test=develop

* Only enable MKLML ERF when OS is linux

* Use the special mklml version that includes the vmsErf function to verify the gelu mkl kernel.

test=develop

* Add the CUDA macro to avoid NVCC's compile issue.

test=develop

* Add the TODO comments for mklml library modification.

test=develop

* Clean Code

test=develop

* Add a comment on the macro for the NVCC compiler.

test=develop

676995c8
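
For context, the exact GELU is commonly written as 0.5 · x · (1 + erf(x / √2)), which is why a fast vectorized erf can speed up the operator. The scalar sketch below uses Python's `math.erf` as a stand-in for a vectorized erf routine; the name `gelu_erf` is illustrative and is not Paddle's API.

```python
import math


def gelu_erf(x):
    """Exact GELU: x times the standard normal CDF, expressed with erf."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for sample in (-1.0, 0.0, 1.0):
    print(sample, gelu_erf(sample))
```

A vectorized kernel would apply the same formula to a whole buffer at once: scale the input by 1/√2, call the library's erf on the buffer, then combine the result with x elementwise.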

13 Dec, 2018 1 commit
- Y
  
  Use mkl · 7b10bf0e
  Authored by Yu Yang on Dec 13, 2018
  
  7b10bf0e
28 Nov, 2018 1 commit
- J
  - Coding style fixes · 48e1b97e
  Authored by Jacek Czaja on Nov 28, 2018
```
test=develop
```
  48e1b97e
27 Nov, 2018 2 commits
- J
  
  - Building fix to softmax for inference · cf40daee
  Authored by Jacek Czaja on Nov 27, 2018
  
  cf40daee
- J
  
  - ASUM MKL integration · 8bfa1fa9
  Authored by Jacek Czaja on Nov 27, 2018
  
  8bfa1fa9
16 Nov, 2018 1 commit
- T
  fix lrn on mac (#14426) · 64f7516a
  Authored by tensor-tang on Nov 16, 2018
```
* rename and fix blas vsqr

test=develop

* update
```
  64f7516a
13 Nov, 2018 1 commit
- T
  
  add mkl vsqr and vpow · 1be85d01
  Authored by tensor-tang on Nov 13, 2018
  
  1be85d01
22 Aug, 2018 5 commits
- T
  
  fix bugs · cf5ea925
  Authored by tensor-tang on Aug 22, 2018
  
  cf5ea925
- T
  
  add blas vexp · 3dd66390
  Authored by tensor-tang on Aug 22, 2018
  
  3dd66390
- T
  
  fix blas dot and add cblas scal · 0ec1f65c
  Authored by tensor-tang on Aug 22, 2018
  
  0ec1f65c
- T
  
  add cblas dot · a2203d04
  Authored by tensor-tang on Aug 22, 2018
  
  a2203d04
- T
  
  refine blas gemm · f72ab896
  Authored by tensor-tang on Aug 22, 2018
  
  f72ab896
16 Aug, 2018 1 commit
- T
  
  add mklml vmul · 6644ce79
  Authored by tensor-tang on Aug 16, 2018
  
  6644ce79
06 Aug, 2018 1 commit
- T
  
  fix blas · 54c95e49
  Authored by tensor-tang on Aug 06, 2018
  
  54c95e49
03 Aug, 2018 1 commit
- T
  
  add mkl packed gemm · 43cee33a
  Authored by tensor-tang on Aug 02, 2018
  
  43cee33a
18 Jul, 2018 2 commits
- T
  
  refine gemm · a916c525
  Authored by tensor-tang on Jul 18, 2018
  
  a916c525
- T
  
  mkl split gemm for better perf · 961e754c
  Authored by tensor-tang on Jul 18, 2018
  
  961e754c
11 Jul, 2018 2 commits
- T
  
  disable xsmm with float16 · 1c5d6c56
  Authored by tensor-tang on Jul 11, 2018
  
  1c5d6c56
- T
  
  refine the threshold functions · 64a8e6d2
  Authored by tensor-tang on Jul 11, 2018
  
  64a8e6d2
10 Jul, 2018 1 commit
- T
  
  refine the ColMajor replacement · 6bc1aaaa
  Authored by tensor-tang on Jul 10, 2018
  
  6bc1aaaa
09 Jul, 2018 1 commit
- T
  
  fix ColMajor and RowMajor replacement · de856da9
  Authored by tensor-tang on Jul 09, 2018
  
  de856da9
05 Jul, 2018 1 commit
- T
  
  add libxsmm_gemm · c3941745
  Authored by tensor-tang on Jul 05, 2018
  
  c3941745
20 Jun, 2018 1 commit
- T
  
  enable dynamic load mklml lib on fluid · f503f129
  Authored by tensor-tang on Jun 20, 2018
  
  f503f129
24 May, 2018 1 commit
- T
  
  MKL elementwise add: elementwise_add uses vAdd VML function when MKL is used · e43c8f33
  Authored by Tomasz Patejko on May 17, 2018
  
  e43c8f33
11 May, 2018 1 commit
- Y
  
  Fix typo in blas_impl.h · 66590a0b
  Authored by yuyang18 on May 11, 2018
  
  66590a0b
08 May, 2018 1 commit
- Y
  Move MatMul to blas_impl.h · 0a13d3c6
  Authored by Yu Yang on May 08, 2018
```
Rename MatDim to MatDescriptor
```
  0a13d3c6
04 May, 2018 1 commit
- Y
  
  Clean and extract blas · ef6ea790
  Authored by Yu Yang on May 04, 2018
  
  ef6ea790
03 May, 2018 1 commit
- Y
  
  Clean MatMul · 815d8884
  Authored by Yu Yang on May 03, 2018
  
  815d8884
02 May, 2018 1 commit
- Y
  
  Naive implement cblas · 4db43c6c
  Authored by Yu Yang on May 02, 2018
  
  4db43c6c
28 Apr, 2018 1 commit
- Y
  
  Refactor GEMM in blas · c888e016
  Authored by Yu Yang on Apr 28, 2018
  
  c888e016
