# Inference High-level APIs
This document describes the high-level inference APIs, which can be used to quickly deploy a Paddle model in an application.

The APIs are declared in a single header file, `paddle_inference_api.h`; only two libraries, `libpaddle_fluid.so` and `libpaddle_fluid_api.so`, are needed for deployment.

## PaddleTensor
We provide the `PaddleTensor` data structure as a general tensor interface.

The definition is:

```c++
struct PaddleTensor {
  std::string name;        // variable name.
  std::vector<int> shape;  // tensor shape, e.g. {batch_size, feature_len}.
  PaddleBuf data;          // blob of data, stored contiguously.
  PaddleDType dtype;       // data type of the elements, e.g. FLOAT32.
};
```

The data is stored in a contiguous memory block, `PaddleBuf`, and a `PaddleDType` specifies the tensor's data type.
The `name` field specifies the name of an input variable,
which is important when there are multiple inputs and one needs to distinguish which variable to set.
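
For example, an input tensor might be prepared as follows. This is a minimal sketch: the variable name `"image"`, the shape, and the assumption that `PaddleBuf` is a plain `{data, length}` pair are illustrative, not taken from the header.

```c++
#include <vector>

// A hypothetical 1x3x224x224 float input, e.g. one NCHW image.
std::vector<float> input(1 * 3 * 224 * 224, 0.f);

PaddleTensor tensor;
tensor.name = "image";            // hypothetical input variable name.
tensor.shape = {1, 3, 224, 224};
tensor.data.data = input.data();  // assumes PaddleBuf is {void* data; size_t length;}.
tensor.data.length = input.size() * sizeof(float);
tensor.dtype = PaddleDType::FLOAT32;
```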

## Engine
The inference APIs have two different underlying engines:

- the native engine, which consists of the native operators and framework;
- the Anakin engine, which has an Anakin library embedded.

The native engine takes a native Paddle model as input and supports any model trained by Paddle.
The Anakin engine is faster for some models, but it only accepts models in the Anakin format (the user needs to transform the format manually first), and currently not all Paddle models are supported.

```c++
enum class PaddleEngineKind {
  kNative = 0,  // Use the native Fluid facility.
  kAnakin,      // Use Anakin for inference.
};
```

## PaddlePredictor and how to create one
The main interface is `PaddlePredictor`; it has the following methods:

- `bool Run(const std::vector<PaddleTensor>& inputs, std::vector<PaddleTensor>* output_data)`
  - takes the inputs and fills `output_data` with the inference results.
- `Clone`, which clones a predictor from an existing one, with the model parameters shared.

There is a factory method to help create a predictor; the user takes ownership of the returned object.

```c++
template <typename ConfigT, PaddleEngineKind engine = PaddleEngineKind::kNative>
std::unique_ptr<PaddlePredictor> CreatePaddlePredictor(const ConfigT& config);
```

By specifying the engine kind and config, one can get a specific implementation.
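
As a sketch of the whole flow, the snippet below creates a native-engine predictor and runs it on the tensor prepared earlier. The `NativeConfig` fields used here (e.g. `model_dir`) and the model path are assumptions; check `paddle_inference_api.h` for the actual config of each engine.

```c++
NativeConfig config;
config.model_dir = "./my_model";  // hypothetical path to a saved Paddle model.

// The engine kind is selected through the template arguments.
auto predictor =
    CreatePaddlePredictor<NativeConfig, PaddleEngineKind::kNative>(config);

std::vector<PaddleTensor> inputs{tensor};  // e.g. the tensor prepared above.
std::vector<PaddleTensor> outputs;
if (!predictor->Run(inputs, &outputs)) {
  // handle the inference failure.
}

// For multi-threaded serving, each thread can use its own clone,
// with the model parameters shared.
auto thread_predictor = predictor->Clone();
```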

## Reference

- [paddle_inference_api.h](./paddle_inference_api.h)
- [some demos](./demo)