Commits · 40c7972d4619564ad3c09cf8774796cc523c2ae9 · PaddlePaddle / Paddle

10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
21 Jan, 2018 1 commit
- follow comments · 782ddc5f
  Committed by chengduoZH on Jan 21, 2018
19 Jan, 2018 1 commit
- follow comments · 0468422d
  Committed by chengduoZH on Jan 19, 2018
18 Jan, 2018 3 commits
- modify doc · 259858b4
  Committed by chengduoZH on Jan 18, 2018
- code refine · 578d60bf
  Committed by chengduoZH on Jan 18, 2018
- add 4-d for matmul_op · 2edc136c
  Committed by chengduoZH on Jan 18, 2018
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
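
The first item in the list above (templating math functors and kernels on `DeviceContext` instead of `Place`) can be illustrated with a minimal sketch. This is not Paddle's actual code: `CPUDeviceContext` and `CUDADeviceContext` here are empty stand-ins, and `ScaleFunctor`, `ScaleKernel`, and `Name()` are hypothetical names invented for illustration. The point is only that the context type selects the specialization and the context object is passed straight into the functor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for device context classes (not Paddle's real ones).
struct CPUDeviceContext {
  static std::string Name() { return "cpu"; }
};
struct CUDADeviceContext {
  static std::string Name() { return "cuda"; }
};

// A math functor templated on the device context type rather than on a
// place type. Only the specializations below are defined.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

// CPU specialization: receives the context object directly.
template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    (void)ctx;  // a real context would carry device handles
    assert(data != nullptr);
    for (T& v : *data) v *= factor;
  }
};

// Placeholder "CUDA" specialization. A real one would launch a device
// kernel on the context's stream; this sketch just runs on the host.
template <typename T>
struct ScaleFunctor<CUDADeviceContext, T> {
  void operator()(const CUDADeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    (void)ctx;
    assert(data != nullptr);
    for (T& v : *data) v *= factor;
  }
};

// A kernel templated the same way; it forwards its context to the functor.
template <typename DeviceContext, typename T>
struct ScaleKernel {
  std::string Compute(const DeviceContext& ctx, T factor,
                      std::vector<T>* data) const {
    ScaleFunctor<DeviceContext, T>()(ctx, factor, data);
    return DeviceContext::Name();
  }
};
```

With this shape, a kernel is instantiated as `ScaleKernel<CPUDeviceContext, float>` and never needs a separate lookup from a place to a device object.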

18 Oct, 2017 1 commit

MatMul operator (#4856) · 16489827

Committed by Markus Kliegl on Oct 17, 2017

* initial matmul operator

Similar to np.matmul, but also has transpose_X and transpose_Y flags,
and only supports tensors from rank 1 to 3 inclusive.

For GPU, uses cublas?gemmStridedBatched. For CPU, uses
cblas_?gemm_batch if available via MKL; otherwise a simple serial
implementation that loops over the batch dimension is employed for now.
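
As a rough illustration of the serial fallback described above, the sketch below loops over the batch dimension and applies the two transpose flags. It is an assumption-laden toy, not the operator's implementation: `BatchedMatMul` and its parameters are hypothetical names, inputs are assumed to be row-major rank-3 tensors with the same batch size, and no BLAS library is called.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes out[b] = op(X[b]) * op(Y[b]) for each batch index b, where
// op(X) is m x k and op(Y) is k x n. Storage is row-major. When
// transpose_x is true, X[b] is stored as k x m; when transpose_y is true,
// Y[b] is stored as n x k.
std::vector<float> BatchedMatMul(const std::vector<float>& x,
                                 const std::vector<float>& y,
                                 std::size_t batch, std::size_t m,
                                 std::size_t k, std::size_t n,
                                 bool transpose_x, bool transpose_y) {
  assert(x.size() == batch * m * k);
  assert(y.size() == batch * k * n);
  std::vector<float> out(batch * m * n, 0.0f);
  for (std::size_t b = 0; b < batch; ++b) {  // serial loop over the batch
    const float* xb = x.data() + b * m * k;
    const float* yb = y.data() + b * k * n;
    float* ob = out.data() + b * m * n;
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          // Read element (i, p) of op(X) and element (p, j) of op(Y).
          const float xv = transpose_x ? xb[p * m + i] : xb[i * k + p];
          const float yv = transpose_y ? yb[j * k + p] : yb[p * n + j];
          acc += xv * yv;
        }
        ob[i * n + j] = acc;
      }
    }
  }
  return out;
}
```

A batched BLAS routine would replace the outer loop with a single library call; the index arithmetic for the transpose flags is what the flags passed to such a routine stand for.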

PaddlePaddle / Paddle: last synced successfully about 1 year ago
