# Inference High-level APIs
This document describes the high-level inference APIs; one can use them to quickly deploy a Paddle model in an application.

The APIs are declared in a single header file, `paddle_inference_api.h`, and two libraries, `libpaddle_inference.so` and `libpaddle_inference_io.so`, are needed for deployment.

## PaddleTensor
We provide the `PaddleTensor` data structure as a general tensor interface.

The definition is:

```c++
struct PaddleTensor {
  std::string name;  // variable name.
  std::vector<int> shape;
  PaddleBuf data;  // blob of data.
  PaddleDType dtype;
};
```

The data is stored in a contiguous memory block, a `PaddleBuf`, and a `PaddleDType` specifies the tensor's data type.
The `name` field specifies the name of an input variable,
which is important when there are multiple inputs and one needs to distinguish which variable to set.
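
For illustration, the following is a minimal sketch of filling a `PaddleTensor` for a single float32 input. The helper name is hypothetical, and it assumes `PaddleBuf` is a plain `{void* data; size_t length;}` struct and that `PaddleDType` has a `FLOAT32` enumerator; check `paddle_inference_api.h` for the exact definitions.

```c++
#include <vector>
// #include "paddle_inference_api.h"  // PaddleTensor, PaddleBuf, PaddleDType.

// Hypothetical helper: wrap a host buffer into a non-owning PaddleTensor.
PaddleTensor MakeInputTensor(std::vector<float>& buffer) {
  PaddleTensor tensor;
  tensor.name = "x";  // must match the model's input variable name.
  tensor.shape = {1, static_cast<int>(buffer.size())};  // a batch of one.
  tensor.data.data = buffer.data();                    // pointer to raw data.
  tensor.data.length = buffer.size() * sizeof(float);  // size in bytes.
  tensor.dtype = PaddleDType::FLOAT32;  // assumed enumerator; see the header.
  return tensor;
}
```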

## Engine
The inference APIs have two different underlying engines:

- the native engine
- the TensorRT engine

The native engine, which consists of the native operators and framework, takes a native Paddle model
as input, and supports any model trained by Paddle.

```c++
enum class PaddleEngineKind {
  kNative = 0,  // Use the native Fluid facility.
  kAutoMixedTensorRT // Automatically mixing TensorRT with the Fluid ops.
};
```

## PaddlePredictor and how to create one
The main interface is `PaddlePredictor`; it has the following methods:

- `bool Run(const std::vector<PaddleTensor>& inputs, std::vector<PaddleTensor>* output_data)`
  - takes the inputs and fills `output_data` with the prediction results.
- `Clone`, to clone a predictor from an existing one, with the model parameters shared.

There is a factory method to help create a predictor, and the user takes ownership of the returned object.

```c++
template <typename ConfigT, PaddleEngineKind engine = PaddleEngineKind::kNative>
std::unique_ptr<PaddlePredictor> CreatePaddlePredictor(const ConfigT& config);
```

By specifying the engine kind and config, one can get a specific implementation.
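
Putting the pieces together, a minimal end-to-end sketch with the native engine might look like this. `NativeConfig` and its `model_dir` / `use_gpu` fields are assumptions about the header, the model path is hypothetical, and `MakeInputTensor` is the sketch helper from the `PaddleTensor` section; only the factory signature above is taken verbatim from this document.

```c++
#include <vector>
// #include "paddle_inference_api.h"

int main() {
  // Configure the native engine; the config type and its fields are
  // assumed to match the declarations in paddle_inference_api.h.
  NativeConfig config;
  config.model_dir = "./my_model";  // hypothetical path to a saved model.
  config.use_gpu = false;

  // The factory returns a std::unique_ptr, so the caller owns the predictor.
  auto predictor = CreatePaddlePredictor<NativeConfig>(config);

  // Prepare one input (see the hypothetical MakeInputTensor helper above)
  // and run the prediction; Run returns false on failure.
  std::vector<float> input(784, 0.f);
  std::vector<PaddleTensor> inputs{MakeInputTensor(input)};
  std::vector<PaddleTensor> outputs;
  if (!predictor->Run(inputs, &outputs)) return 1;

  // Clone shares the model parameters with the original predictor,
  // e.g. to give each worker thread its own predictor.
  auto worker = predictor->Clone();
  return 0;
}
```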

## Reference

- [paddle_inference_api.h](./paddle_inference_api.h)
- [some demos](./demo_ci)