Commits · 25df89293295737437d64e085a048e7c8bd815b4 · s920243400 / PaddleDetection

18 Oct, 2017 1 commit

Committed by Markus Kliegl on Oct 17, 2017

* initial matmul operator

Similar to np.matmul, but also has transpose_X and transpose_Y flags,
and only supports tensors from rank 1 to 3 inclusive.

For GPU, uses cublas?gemmStridedBatched. For CPU, uses
cblas_?gemm_batch if available via MKL; otherwise a simple serial
implementation that loops over the batch dimension is employed for now.

16489827
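
The serial CPU fallback described in the commit message above (loop over the batch dimension, apply the optional transposes, multiply one matrix pair at a time) can be sketched as follows. This is a minimal illustration in plain Python, not the operator's actual code: the function names `transpose`, `matmul_2d`, and `batched_matmul` are hypothetical, and the sketch assumes rank-3 inputs only, given as non-empty nested lists of shape (batch, rows, cols).

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Return the transpose of a 2-D list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def matmul_2d(x: Matrix, y: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix with plain loops."""
    if len(x[0]) != len(y):
        raise ValueError("inner dimensions do not match")
    y_cols = transpose(y)  # iterate over the columns of y as rows
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]


def batched_matmul(xs: List[Matrix], ys: List[Matrix],
                   transpose_x: bool = False,
                   transpose_y: bool = False) -> List[Matrix]:
    """Serial batched product: one 2-D multiplication per batch entry.

    Hypothetical helper for illustration; the flag names mirror the
    transpose_X / transpose_Y flags mentioned in the commit message.
    """
    if len(xs) != len(ys):
        raise ValueError("batch sizes do not match")
    out = []
    for x, y in zip(xs, ys):  # the loop over the batch dimension
        if transpose_x:
            x = transpose(x)
        if transpose_y:
            y = transpose(y)
        out.append(matmul_2d(x, y))
    return out
```

A batched BLAS routine replaces the outer loop with a single library call; the sketch only shows the semantics that such a call would be expected to reproduce.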

10 Aug, 2017 3 commits
- format code · 688c43b1
  Committed by qijun on Aug 10, 2017
- Fix gaussian_random_op compile error · 45911102
  Committed by Yu Yang on Aug 10, 2017
  ```
  * Should always use `dynload::` for cuda function.
  * Fix cublas.h without DSO load.
  ```
- fix bug in dynload · 5f1081d8
  Committed by qijun on Aug 10, 2017
13 Jul, 2017 1 commit
- fix bug in dynload · 4e918377
  Committed by qijun on Jul 13, 2017
11 Jul, 2017 2 commits
- fix cublas dynload bug · 69d76812
  Committed by qijun on Jul 11, 2017
- Refine CUDA Related libraries · a0466053
  Committed by Yu Yang on Jul 11, 2017
04 Jul, 2017 3 commits
- fix wrong including header-file in files in paddle/platform/dynload dir · e6fcdd47
  Committed by qijun on Jul 04, 2017
- move to dynload directory · 3567ea6d
  Committed by qijun on Jul 04, 2017
- follow comments · 9eeabe98
  Committed by qijun on Jul 04, 2017
03 Jul, 2017 2 commits
- fix cuda compile error · a77fcef3
  Committed by qijun on Jul 03, 2017
- add dynamic_load · 3ba7a738
  Committed by qijun on Jul 03, 2017
