This document describes the high-level inference APIs one can use to quickly deploy a Paddle model for an application.
The APIs are described in `paddle_inference_api.h`, just one header file; only two libraries, `libpaddle_fluid.so` and `libpaddle_fluid_api.so`, are needed for deployment.
## PaddleTensor
We provide the `PaddleTensor` data structure to give a general tensor interface.
The definition is

```c++
struct PaddleTensor {
  std::string name;        // variable name.
  std::vector<int> shape;
  PaddleBuf data;          // blob of data.
  PaddleDType dtype;
};
```
The data is stored in a contiguous memory block `PaddleBuf`, and the tensor's data type is specified by a `PaddleDType`.
The `name` field is used to specify the name of an input variable,
which is important when there are multiple inputs and one needs to distinguish which variable to set.
## engine
The inference APIs have two different underlying engines:

- the native engine, which consists of the native operators and framework,
- the Anakin engine, which has an Anakin library embedded.

The native engine takes a native Paddle model as input and supports any model trained by Paddle.
The Anakin engine is faster for some models,
but it can only take an Anakin model as input (the user needs to transform the format manually first), and currently not all Paddle models are supported.
```c++
enum class PaddleEngineKind {
  kNative = 0,  // Use the native Fluid facility.
  kAnakin,      // Use Anakin for inference.
};
```
## PaddlePredictor and how to create one
The main interface is `PaddlePredictor`, which has the following methods