提交 · 6dc0e663f42ddf9ac59eb047b4b64b265b7e2c2b · 机器未来 / Paddle

12 12月, 2017 1 次提交

Refine device context (#6433) · 61ec0b95

由 QI JUN 提交于 12月 12, 2017

There are mainly following fixes:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

14 11月, 2017 1 次提交

Fix matmal_op for debug mode · 6a6e4d8d

由 xuwei06 提交于 11月 10, 2017

The dimension is not set correctly and is not being checked in release mode because eigen_assert is not enabled.

6a6e4d8d

11 11月, 2017 1 次提交
- D
  
  Use G++ to compile some cu operators. · f5e36765
  由 dangqingqing 提交于 11月 11, 2017
  
  f5e36765
20 10月, 2017 1 次提交

Remove template parameter for Tensor methods (#4937) · c532b967

由 Yu Yang 提交于 10月 19, 2017

* Remove template parameter for Tensor methods

* Also check the type is correct when data()
* Simplize holder_

* Fix accuracy_op

* Register Code

c532b967

18 10月, 2017 1 次提交

MatMul operator (#4856) · 16489827

由 Markus Kliegl 提交于 10月 17, 2017

* initial matmul operator

Similar to np.matmul, but also has transpose_X and transpose_Y flags,
and only supports tensors from rank 1 to 3 inclusive.

For GPU, uses cublas?gemmStridedBatched. For CPU, uses
cblas_?gemm_batch if available via MKL; otherwise a simple serial
implementation that loops over the batch dimension is employed for now.

16489827

机器未来 / Paddle 与 Fork 源项目一致

机器未来 / Paddle
与 Fork 源项目一致