# Add Kernels for a New Device

## Background

PaddlePaddle Fluid has hundreds of operators.  Each operator can have one or more kernels.  A kernel is an implementation of the operator for a particular device, which can be a hardware device, e.g., a CUDA GPU, or a library that uses a device, e.g., Intel MKL, which makes full use of Xeon CPUs.

[This document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/new_op_en.md) explains how to add an operator and its kernels.  The kernels of an operator are indexed by a C++ type, [`OpKernelType`](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/multi_devices/operator_kernel_type.md).  An operator chooses the right kernel at runtime.  This selection mechanism is described [here](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/execution/switch.md).

## Write Kernels for a New Device

### Add A New Device

For historical reasons, we misuse the word *library* for *device*.  For example, we call the device type the *library type*.  An example is the header file [`library_type.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/library_type.h#L24).  We will correct this ASAP.

To register a new device, we need to add an enum value to `LibraryType`:

```cpp
enum class LibraryType {
  kPlain = 0,
  kMKLDNN = 1,
  kCUDNN = 2,
};
```
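
As a minimal sketch of this step, suppose the new device is an FPGA. The value `kFPGA` and the helper `LibraryTypeToString` below are hypothetical names chosen for illustration, not confirmed parts of the PaddlePaddle source; only the three existing values come from the enum above.

```cpp
#include <cassert>
#include <string>

// The first three values are copied from the enum above; kFPGA is a
// hypothetical addition used only for illustration.
enum class LibraryType {
  kPlain = 0,
  kMKLDNN = 1,
  kCUDNN = 2,
  kFPGA = 3,  // hypothetical new device
};

// Illustrative helper (an assumption, not a confirmed Paddle API) that maps
// each value to a readable name, e.g. for log messages.
inline std::string LibraryTypeToString(LibraryType type) {
  switch (type) {
    case LibraryType::kPlain:
      return "PLAIN";
    case LibraryType::kMKLDNN:
      return "MKLDNN";
    case LibraryType::kCUDNN:
      return "CUDNN";
    case LibraryType::kFPGA:
      return "FPGA";
  }
  // Reached only if a new enum value is added without updating the switch.
  assert(false && "unhandled LibraryType");
  return "UNKNOWN";
}
```

Keeping such a helper next to the enum means a newly added value that is missing from the `switch` is easy to notice during development.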


### Add A New [Place](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/place.h#L53)

If you have a new kind of device, you first need to add a new kind of [`Place`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/place.h#L53). The existing `CUDAPlace` is an example:

```cpp
struct CUDAPlace {
  CUDAPlace() : CUDAPlace(0) {}
  explicit CUDAPlace(int d) : device(d) {}

  inline int GetDeviceId() const { return device; }
  // needed for variant equality comparison
  inline bool operator==(const CUDAPlace &o) const {
    return device == o.device;
  }
  inline bool operator!=(const CUDAPlace &o) const { return !(*this == o); }

  int device;
};

typedef boost::variant<CUDAPlace, CPUPlace> Place;
```
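
A new place could follow the same pattern. The sketch below assumes a hypothetical `FPGAPlace` and a hypothetical helper `is_fpga_place`. It uses C++17 `std::variant` in place of `boost::variant` so that it compiles without Boost, and `CPUPlace` is reduced to a simplified stand-in.

```cpp
#include <cassert>
#include <variant>

// Simplified stand-ins for the existing places.
struct CPUPlace {
  bool operator==(const CPUPlace &) const { return true; }
  bool operator!=(const CPUPlace &) const { return false; }
};

struct CUDAPlace {
  CUDAPlace() : CUDAPlace(0) {}
  explicit CUDAPlace(int d) : device(d) {}
  int GetDeviceId() const { return device; }
  bool operator==(const CUDAPlace &o) const { return device == o.device; }
  bool operator!=(const CUDAPlace &o) const { return !(*this == o); }
  int device;
};

// Hypothetical place for a new device, modeled on CUDAPlace.
struct FPGAPlace {
  FPGAPlace() : FPGAPlace(0) {}
  explicit FPGAPlace(int d) : device(d) { assert(d >= 0); }
  int GetDeviceId() const { return device; }
  // Needed for variant equality comparison.
  bool operator==(const FPGAPlace &o) const { return device == o.device; }
  bool operator!=(const FPGAPlace &o) const { return !(*this == o); }
  int device;
};

// The new place is added to the variant so that it can be passed around
// wherever a Place is expected.
using Place = std::variant<CUDAPlace, CPUPlace, FPGAPlace>;

// Hypothetical helper: true when `p` currently holds an FPGAPlace.
inline bool is_fpga_place(const Place &p) {
  return std::holds_alternative<FPGAPlace>(p);
}
```

The comparison operators matter because comparing two `Place` values delegates to the operators of the alternative they hold.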

### Add a [device context](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/device_context.h#L37)

After a new kind of device is added, you should add a corresponding [`DeviceContext`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/device_context.h#L37) for it.

```cpp
class DeviceContext {
 public:
  virtual ~DeviceContext() {}
  virtual Place GetPlace() const = 0;

  virtual void Wait() const {}
};
```
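
A context for a new device would derive from this base class. The following is a minimal sketch under stated assumptions: `FPGAPlace` and `FPGADeviceContext` are hypothetical names, `Place` is a simplified `std::variant` stand-in, and `Wait()` only counts calls instead of synchronizing with real hardware.

```cpp
#include <cassert>
#include <variant>

// Simplified stand-ins so the sketch compiles on its own.
struct CPUPlace {};
struct FPGAPlace {  // hypothetical place for the new device
  explicit FPGAPlace(int d = 0) : device(d) {}
  int device;
};
using Place = std::variant<CPUPlace, FPGAPlace>;

// Base class, as shown above.
class DeviceContext {
 public:
  virtual ~DeviceContext() {}
  virtual Place GetPlace() const = 0;

  virtual void Wait() const {}
};

// Hypothetical context for the new device.
class FPGADeviceContext : public DeviceContext {
 public:
  explicit FPGADeviceContext(FPGAPlace place) : place_(place) {
    assert(place_.device >= 0);
  }

  Place GetPlace() const override { return place_; }

  // A real context would block here until queued device work finishes;
  // this sketch only counts the calls.
  void Wait() const override { ++wait_calls_; }

  int wait_calls() const { return wait_calls_; }

 private:
  FPGAPlace place_;
  mutable int wait_calls_ = 0;
};
```

A real context would typically also own device resources such as streams or library handles, which kernels then obtain through the context.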

### Implement a new [OpKernel](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.h#L351) for your device

Detailed documentation can be found in [`new_op_and_kernel`](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/new_op_en.md).

```cpp
class OpKernelBase {
 public:
  /**
   * ExecutionContext is the only parameter of the kernel's Compute function.
   * Compute gets input/output variables, state such as momentum, and
   * device resources such as the CUDA stream and cuBLAS handle from
   * ExecutionContext. Users should construct it before running the operator.
   */

  virtual void Compute(const ExecutionContext& context) const = 0;

  virtual ~OpKernelBase() = default;
};

template <typename T>
class OpKernel : public OpKernelBase {
 public:
  using ELEMENT_TYPE = T;
};
```
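
To make the interface concrete, here is a minimal sketch of a kernel built on these base classes. The `ExecutionContext` below is a heavily simplified stand-in that carries only raw input and output buffers, and `FPGADoubleKernel` is a hypothetical kernel that doubles every element; neither reflects the real framework types.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in: the real ExecutionContext exposes variables and
// device resources; this one carries only raw buffers and their length.
struct ExecutionContext {
  const void *input;
  void *output;
  std::size_t numel;
};

class OpKernelBase {
 public:
  virtual void Compute(const ExecutionContext &context) const = 0;

  virtual ~OpKernelBase() = default;
};

template <typename T>
class OpKernel : public OpKernelBase {
 public:
  using ELEMENT_TYPE = T;
};

// Hypothetical kernel for a new device: writes 2 * input[i] to output[i].
template <typename T>
class FPGADoubleKernel : public OpKernel<T> {
 public:
  void Compute(const ExecutionContext &context) const override {
    assert(context.input != nullptr && context.output != nullptr);
    const T *in = static_cast<const T *>(context.input);
    T *out = static_cast<T *>(context.output);
    for (std::size_t i = 0; i < context.numel; ++i) {
      out[i] = in[i] * static_cast<T>(2);
    }
  }
};
```

Because the kernel is a template over `T`, the same class yields one kernel per data type, e.g. `FPGADoubleKernel<float>` and `FPGADoubleKernel<double>`.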


### Register the OpKernel to framework

After writing the components described above, we should register the kernel to the framework.

We use `REGISTER_OP_KERNEL` to do the registration.

```cpp
REGISTER_OP_KERNEL(
	op_type,
	library_type,
	place_type,
	kernel0, kernel1, ...)
```

`kernel0`, `kernel1`, ... are kernels that share the same `op_type`, `library_type`, and `place_type` but differ in `data_type`.

Take [`conv2d`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/conv_cudnn_op.cu.cc#L318) as an example:

```cpp
REGISTER_OP_KERNEL(conv2d, CPU, paddle::platform::CPUPlace,
    paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>,
    paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double>);

REGISTER_OP_KERNEL(conv2d, CUDNN, ::paddle::platform::CUDAPlace,
    paddle::operators::CUDNNConvOpKernel<float>,
    paddle::operators::CUDNNConvOpKernel<double>);
```

In the code above:

 - `conv2d` is the type (name) of the operator.
 - `CPU` and `CUDNN` are the `library` values.
 - `paddle::platform::CPUPlace` and `paddle::platform::CUDAPlace` are the `place` values.
 - The template parameter `float` or `double` of `GemmConvKernel` and `CUDNNConvOpKernel<T>` is the `data_type`.
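
To illustrate how one registration call can cover several data types, here is a simplified sketch of a registry keyed by operator name, library, place, and data type. It is not how `REGISTER_OP_KERNEL` is implemented; `KernelKey`, `KernelRegistry`, `RegisterOpKernels`, `FindKernel`, `FPGAPlace`, and `FPGAConvKernel` are all hypothetical names, and the base classes are reduced to stand-ins.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <typeindex>
#include <typeinfo>

enum class LibraryType { kPlain = 0, kMKLDNN = 1, kCUDNN = 2 };

// Reduced stand-ins for the kernel base classes.
struct OpKernelBase {
  virtual ~OpKernelBase() = default;
};
template <typename T>
struct OpKernel : OpKernelBase {
  using ELEMENT_TYPE = T;
};

// Hypothetical key: operator name, library, place type, element type.
using KernelKey =
    std::tuple<std::string, LibraryType, std::type_index, std::type_index>;
using KernelMap = std::map<KernelKey, std::unique_ptr<OpKernelBase>>;

// Hypothetical global registry.
inline KernelMap &KernelRegistry() {
  static KernelMap registry;
  return registry;
}

// Registers every kernel in the pack under the same op, library, and place;
// the data type is taken from each kernel's ELEMENT_TYPE.
template <typename PlaceType, typename... Kernels>
void RegisterOpKernels(const std::string &op_type, LibraryType library) {
  assert(!op_type.empty());
  // C++17 fold expression: one emplace per kernel type.
  (KernelRegistry().emplace(
       KernelKey{op_type, library, typeid(PlaceType),
                 typeid(typename Kernels::ELEMENT_TYPE)},
       std::make_unique<Kernels>()),
   ...);
}

// Looks up a kernel; returns nullptr when none is registered.
template <typename PlaceType, typename T>
const OpKernelBase *FindKernel(const std::string &op_type,
                               LibraryType library) {
  auto it = KernelRegistry().find(
      KernelKey{op_type, library, typeid(PlaceType), typeid(T)});
  return it == KernelRegistry().end() ? nullptr : it->second.get();
}

// Hypothetical place and kernel for a new device.
struct FPGAPlace {};
template <typename T>
struct FPGAConvKernel : OpKernel<T> {};
```

With these definitions, a call such as `RegisterOpKernels<FPGAPlace, FPGAConvKernel<float>, FPGAConvKernel<double>>("conv2d", LibraryType::kPlain)` stores two entries that differ only in data type, which mirrors the role of `kernel0, kernel1, ...` in the macro.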