JIT (Just In Time) Kernel contains the actually generated code and some other implementations with the same logic.
Each implementation has its own condition for use, defined in `CanBeUsed`.
They are combined to get the best performance out of one single independent function.
They can be very simple functions like vector multiply, or complicated functions like LSTM.
They can also be composed with other existing JIT kernels to build up a complex function.
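
To illustrate how several implementations with the same logic can sit behind one function, here is a minimal sketch. All names in it (`VMulImpl`, `PickVMul`, `can_be_used`, and so on) are hypothetical and are not taken from the Paddle source; it only mimics the idea of a reference implementation plus a faster variant guarded by a usage condition.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// All names below are hypothetical; this is not the Paddle API.
using VMulFunc = void (*)(const float*, const float*, float*, int);

// Reference implementation: a plain loop that is always usable.
inline void VMulRefer(const float* x, const float* y, float* z, int n) {
  for (int i = 0; i < n; ++i) z[i] = x[i] * y[i];
}

// Unrolled variant; only correct when n is a multiple of 4.
inline void VMulUnroll4(const float* x, const float* y, float* z, int n) {
  for (int i = 0; i < n; i += 4) {
    z[i] = x[i] * y[i];
    z[i + 1] = x[i + 1] * y[i + 1];
    z[i + 2] = x[i + 2] * y[i + 2];
    z[i + 3] = x[i + 3] * y[i + 3];
  }
}

// One candidate: a name, its usage condition, and the function itself.
struct VMulImpl {
  std::string name;
  std::function<bool(int)> can_be_used;  // condition on the attribute n
  VMulFunc func;
};

// Candidates ordered from most specialized to most general.
inline std::vector<VMulImpl> AllVMulImpls() {
  return {
      {"unroll4", [](int n) { return n > 0 && n % 4 == 0; }, &VMulUnroll4},
      {"refer", [](int) { return true; }, &VMulRefer},
  };
}

// Return the first candidate whose condition accepts n.
inline VMulImpl PickVMul(int n) {
  for (const auto& impl : AllVMulImpls()) {
    if (impl.can_be_used(n)) return impl;
  }
  assert(false && "the reference implementation accepts every n");
  return VMulImpl{"refer", [](int) { return true; }, &VMulRefer};
}
```

In this sketch the reference function is listed last, so it acts as the fallback whenever no specialized variant accepts the attribute.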
...
@@ -42,35 +42,62 @@ All basical definations of jit kernels are addressed in `paddle/fluid/operators/
## How to use
We present these methods to get the functions:
- `GetAllCandidateFuncs`. It returns all the supported implementations. All of them produce the same result, so you can run a runtime benchmark to choose which one should actually be used.
- `GetDefaultBestFunc`. It returns only one default function pointer, which is tuned offline with some general configurations and attributes. This should cover most situations.
- `KernelFuncs::Cache()`. It gets the default function and saves it for the next call with the same attribute.
- `GetReferFunc`. It only gets the reference code on CPU; all the other implementations have the same logic as this reference code.

All kernels are included in `paddle/fluid/operators/jit/kernels.h`, which is automatically generated at compile time. You only need to include this one header to get all the registered kernels.
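
The idea of getting a default function once and reusing it for the same attribute can be sketched as follows. This is only an illustration under stated assumptions: `VAddCache`, `SelectDefault`, and the threshold of 8 are hypothetical and do not reflect how `KernelFuncs::Cache()` is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// All names below are hypothetical; this is not the Paddle API.
using VAddFunc = void (*)(const double*, const double*, double*, int);

// Reference implementation: a plain loop.
inline void VAddRefer(const double* x, const double* y, double* z, int n) {
  for (int i = 0; i < n; ++i) z[i] = x[i] + y[i];
}

// Variant that handles two elements per iteration, then the tail.
inline void VAddPairwise(const double* x, const double* y, double* z, int n) {
  int i = 0;
  for (; i + 1 < n; i += 2) {
    z[i] = x[i] + y[i];
    z[i + 1] = x[i + 1] + y[i + 1];
  }
  for (; i < n; ++i) z[i] = x[i] + y[i];
}

// Stand-in for an offline-tuned default choice (the threshold is arbitrary).
inline VAddFunc SelectDefault(int n) {
  return n >= 8 ? &VAddPairwise : &VAddRefer;
}

// Remembers the selected function for each attribute value.
class VAddCache {
 public:
  VAddFunc At(int n) {
    auto it = funcs_.find(n);
    if (it != funcs_.end()) return it->second;  // reuse the earlier selection
    ++selections_;
    VAddFunc f = SelectDefault(n);
    funcs_.emplace(n, f);
    return f;
  }
  // Number of times a function had to be selected rather than reused.
  std::size_t selections() const { return selections_; }

 private:
  std::unordered_map<int, VAddFunc> funcs_;
  std::size_t selections_ = 0;
};
```

A caller would keep one cache object alive and ask it for the function each time, so the selection cost is paid only on the first call for a given attribute.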
## Solid Test
- Unit Test
All functions should be compared with the corresponding reference functions, including the data types `float` and `double`.
- Benchmark
All functions should be tested, making sure the `GetDefaultBestFunc` function obtains the best performance with all attributes.
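
A unit test in the spirit described above compares an implementation against the reference for both `float` and `double`. The sketch below assumes a hypothetical scaling kernel (`VScalRefer`, `VScalFast`) and a hypothetical helper `MatchesReference`; none of these names come from `test.cc`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// All names below are hypothetical; this is not the Paddle API.

// Reference: y[i] = a[0] * x[i], written as a plain loop.
template <typename T>
void VScalRefer(const T* a, const T* x, T* y, int n) {
  for (int i = 0; i < n; ++i) y[i] = a[0] * x[i];
}

// Implementation under test: two elements per iteration, then the tail.
template <typename T>
void VScalFast(const T* a, const T* x, T* y, int n) {
  const T s = a[0];
  int i = 0;
  for (; i + 1 < n; i += 2) {
    y[i] = s * x[i];
    y[i + 1] = s * x[i + 1];
  }
  for (; i < n; ++i) y[i] = s * x[i];
}

// Run both functions on the same input and compare element by element.
template <typename T>
bool MatchesReference(int n, T tol) {
  std::vector<T> x(n), ref(n), out(n);
  for (int i = 0; i < n; ++i) {
    x[i] = static_cast<T>(i) * static_cast<T>(0.25) - static_cast<T>(1);
  }
  const T a = static_cast<T>(1.5);
  VScalRefer(&a, x.data(), ref.data(), n);
  VScalFast(&a, x.data(), out.data(), n);
  for (int i = 0; i < n; ++i) {
    if (std::abs(ref[i] - out[i]) > tol) return false;
  }
  return true;
}
```

Calling the helper with several sizes, including odd ones, exercises both the unrolled body and the tail loop.
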
# How to add new kernel
## Required
1. Add `your_key` to `KernelType`.
2. Add your new `KernelTuple`, which must include `your_key`. It should be a combination of the data type, attribute type and function type. You can refer to `SeqPoolTuple`.
3. Add the reference function of `your_key`.
Note:
- It should run on CPU and not depend on any third-party library.
- Add `USE_JITKERNEL_REFER(your_key)` in `refer/CMakeLists.txt` to make sure this code can be used.
4. Add a unit test in `test.cc`, and verify at least `float` and `double`.
Test more data types for some special functions if necessary, for example `int8`.
5. Add functions in `benchmark.cc` to test all functions of the same `KernelType`. Make sure `GetDefaultBestFunc` always gets the best one.
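
The first three steps can be pictured with a small sketch. It assumes a hypothetical key `kVSquare`, a hypothetical `VSquareTuple`, and a hypothetical accessor `GetVSquareRefer`; the real `KernelType` enum and tuple definitions in the source tree may be laid out differently.

```cpp
#include <cassert>

// All names below are hypothetical; this is not the Paddle API.

// Step 1: a key for the new kernel (kVSquare plays the role of your_key).
enum class KernelType { kNone = 0, kVMul, kVSquare };

// Step 2: a tuple combining the key, data type, attribute type and function type.
template <typename T>
struct VSquareTuple {
  static constexpr KernelType kernel_type = KernelType::kVSquare;
  using data_type = T;
  using attr_type = int;
  using func_type = void (*)(const T*, T*, int);
};

// Step 3: a reference function, plain CPU code with no third-party dependency.
namespace refer {
template <typename T>
void VSquare(const T* x, T* y, int n) {
  for (int i = 0; i < n; ++i) y[i] = x[i] * x[i];
}
}  // namespace refer

// Hand out the reference function through the tuple's function type.
template <typename T>
typename VSquareTuple<T>::func_type GetVSquareRefer() {
  return &refer::VSquare<T>;
}
```

Keeping the function type inside the tuple means every implementation of the same key has to match one signature, which is what lets them be swapped freely.
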
## Optional
Add more implementations of `your_key` for performance enhancement.
1. Add functions based on generated code in `gen`. It should be derived from `JitCode` and should have a corresponding creator from `JitCodeCreator`, which will be registered on `your_key`.
2. If a new attribute type is added, you should specialize `JitCodeKey` for this type.
3. Add more functions in `more`. You can use any third party you wish, like mkl, mkldnn or intrinsic code, to reach the best performance.
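
Specializing a key function for a new attribute type might look roughly like the sketch below. The template `CodeKey`, the struct `PoolAttr`, and the bit layout are all hypothetical stand-ins chosen for illustration; the actual `JitCodeKey` signature is not shown in this document and may differ.

```cpp
#include <cassert>
#include <cstddef>

// All names below are hypothetical; this is not the Paddle API.

// Primary template is only declared; each attribute type provides its own key.
template <typename Attr>
std::size_t CodeKey(const Attr& attr);

// Plain integer attribute: the value itself is the key.
template <>
inline std::size_t CodeKey<int>(const int& d) {
  return static_cast<std::size_t>(d);
}

// A new attribute type with two fields; `type` is assumed to lie in [0, 3].
struct PoolAttr {
  int type;
  int width;
};

// Pack both fields so that distinct attributes map to distinct keys.
template <>
inline std::size_t CodeKey<PoolAttr>(const PoolAttr& a) {
  return (static_cast<std::size_t>(a.width) << 2) |
         static_cast<std::size_t>(a.type);
}
```

The key only needs to distinguish attributes that would lead to different generated code, so the packing scheme can stay this simple as long as the stated range assumption holds.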