# Fluid Inference User Guide

## Table of Contents

- Python Inference API
- Building the Fluid Inference Library
- Linking the Fluid Inference Library
- C++ Inference API
- Inference Examples
- Inference Computation Optimization
- Memory Usage Optimization

## Python Inference API **[Work in Progress]**
- Saving an inference model ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L295))

  ```python
  def save_inference_model(dirname,
                           feeded_var_names,
                           target_vars,
                           executor,
                           main_program=None,
                           model_filename=None,
                           params_filename=None):
  ```
  The inference model and parameters will be saved under the `dirname` directory:
  - The serialized program
    - If `model_filename` is `None`, it is saved to `dirname/__model__`
    - If `model_filename` is not `None`, it is saved to `dirname/model_filename`
  - The parameters
    - If `params_filename` is `None`, each parameter is saved to a separate file named after the corresponding parameter variable
    - If `params_filename` is not `None`, all parameters are saved to `dirname/params_filename`

- Two storage formats
  - Parameters saved to separate files
    - e.g., set `model_filename` to `None` and `params_filename` to `None`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    __model__ batch_norm_1.w_0 batch_norm_1.w_2 conv2d_2.w_0 conv2d_3.w_0 fc_1.w_0 batch_norm_1.b_0 batch_norm_1.w_1 conv2d_2.b_0 conv2d_3.b_0 fc_1.b_0
    ```
  - Parameters saved to a single file
    - e.g., set `model_filename` to `None` and `params_filename` to `__params__`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    __model__ __params__
    ```
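
  A minimal sketch of saving a model follows; the toy network, the variable names `img` and `prediction`, and the output directory are all illustrative placeholders, not part of the API:

  ```python
  import paddle.fluid as fluid

  # A toy network just to have something to save; in practice this is
  # your trained model.
  img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
  prediction = fluid.layers.fc(input=img, size=10, act='softmax')

  place = fluid.CPUPlace()
  exe = fluid.Executor(place)
  exe.run(fluid.default_startup_program())  # initialize the parameters

  # model_filename and params_filename are left as None, so the program is
  # saved to `__model__` and each parameter to its own file.
  fluid.io.save_inference_model(dirname='recognize_digits_conv.inference.model',
                                feeded_var_names=['img'],
                                target_vars=[prediction],
                                executor=exe)
  ```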
- Loading an inference model ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L380))
  ```python
  def load_inference_model(dirname,
                           executor,
                           model_filename=None,
                           params_filename=None):
    ...
    return [program, feed_target_names, fetch_targets]
  ```
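
  A minimal sketch of loading and running the saved model; the directory name and the 1x28x28 input shape are assumptions carried over from the saving example above:

  ```python
  import numpy as np
  import paddle.fluid as fluid

  place = fluid.CPUPlace()
  exe = fluid.Executor(place)

  # Returns the program pruned for inference, the names of the input
  # variables to feed, and the output variables to fetch.
  [inference_program, feed_target_names, fetch_targets] = \
      fluid.io.load_inference_model('recognize_digits_conv.inference.model', exe)

  # Feed one random batch shaped like the model's input and fetch the output.
  tensor_img = np.random.rand(1, 1, 28, 28).astype('float32')
  results = exe.run(inference_program,
                    feed={feed_target_names[0]: tensor_img},
                    fetch_list=fetch_targets)
  print(results[0])
  ```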


## Building the Fluid Inference Library

  - **No extra CMake options are required**
    - 1. Configure the CMake command. For more configuration options, see [Build PaddlePaddle from source](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html)
      ```bash
      $ git clone https://github.com/PaddlePaddle/Paddle.git
      $ cd Paddle
      $ mkdir build
      $ cd build
      $ cmake -DCMAKE_INSTALL_PREFIX=your/path/to/paddle_inference_lib \
          -DCMAKE_BUILD_TYPE=Release \
          -DWITH_PYTHON=ON \
          -DWITH_MKL=OFF \
          -DWITH_GPU=OFF \
          ..
      ```

    - 2. Build PaddlePaddle
      ```bash
      $ make
      ```

    - 3. Deploy. Run the following command to deploy the PaddlePaddle Fluid Inference library to the `your/path/to/paddle_inference_lib` directory.
      ```bash
      $ make inference_lib_dist
      ```

- Directory structure

  ```bash
  $ cd your/path/to/paddle_inference_lib
  $ tree
  .
  |-- paddle
  |   `-- fluid
  |       |-- framework
  |       |-- inference
  |       |   |-- io.h
  |       |   `-- libpaddle_fluid.so
  |       |-- memory
  |       |-- platform
  |       `-- string
  |-- third_party
  |   |-- eigen3
  |   `-- install
  |       |-- gflags
  |       |-- glog
  |       `-- protobuf
  `-- ...
  ```

  In what follows, assume `PADDLE_ROOT=your/path/to/paddle_inference_lib`.



## Linking the Fluid Inference Library
- Example project ([link](https://github.com/luotao1/fluid_inference_example.git))

  - GCC configuration
    ```bash
    $ g++ -o a.out -std=c++11 main.cc \
          -I${PADDLE_ROOT}/ \
          -I${PADDLE_ROOT}/third_party/install/gflags/include \
          -I${PADDLE_ROOT}/third_party/install/glog/include \
          -I${PADDLE_ROOT}/third_party/install/protobuf/include \
          -I${PADDLE_ROOT}/third_party/eigen3 \
          -L${PADDLE_ROOT}/paddle/fluid/inference -lpaddle_fluid \
          -lrt -ldl -lpthread
    ```

  - CMake configuration
    ```cmake
    include_directories(${PADDLE_ROOT}/)
    include_directories(${PADDLE_ROOT}/third_party/install/gflags/include)
    include_directories(${PADDLE_ROOT}/third_party/install/glog/include)
    include_directories(${PADDLE_ROOT}/third_party/install/protobuf/include)
    include_directories(${PADDLE_ROOT}/third_party/eigen3)
    target_link_libraries(${TARGET_NAME}
                          ${PADDLE_ROOT}/paddle/fluid/inference/libpaddle_fluid.so
                          -lrt -ldl -lpthread)
    ```

  - Set the environment variable:
  `export LD_LIBRARY_PATH=${PADDLE_ROOT}/paddle/fluid/inference:$LD_LIBRARY_PATH`
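
  Putting the CMake configuration above into a complete, minimal `CMakeLists.txt` might look like the following sketch; the target name `inference_demo` and the source file `main.cc` are placeholders:

  ```cmake
  cmake_minimum_required(VERSION 2.8)
  project(inference_demo CXX)

  # Path to the library deployed by `make inference_lib_dist`.
  set(PADDLE_ROOT "" CACHE PATH "Path to the Fluid inference library")

  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")

  include_directories(${PADDLE_ROOT}/)
  include_directories(${PADDLE_ROOT}/third_party/install/gflags/include)
  include_directories(${PADDLE_ROOT}/third_party/install/glog/include)
  include_directories(${PADDLE_ROOT}/third_party/install/protobuf/include)
  include_directories(${PADDLE_ROOT}/third_party/eigen3)

  add_executable(inference_demo main.cc)
  target_link_libraries(inference_demo
                        ${PADDLE_ROOT}/paddle/fluid/inference/libpaddle_fluid.so
                        -lrt -ldl -lpthread)
  ```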



## C++ Inference API

- Inference workflow ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_helper.h#L91))

  - 1. Initialize the device
    ```cpp
    #include "paddle/fluid/framework/init.h"
    paddle::framework::InitDevices(false);
    ```

  - 2. Define the place, executor, and scope
    ```cpp
    auto place = paddle::platform::CPUPlace();
    auto executor = paddle::framework::Executor(place);
    auto* scope = new paddle::framework::Scope();
    ```

  - 3. Load the model
    ```cpp
    #include "paddle/fluid/inference/io.h"
    auto inference_program = paddle::inference::Load(executor, *scope, dirname);
    // or
    auto inference_program = paddle::inference::Load(executor,
                                                     *scope,
                                                     dirname + "/" + model_filename,
                                                     dirname + "/" + params_filename);
    ```

  - 4. Get the `feed_target_names` and `fetch_target_names`
    ```cpp
    const std::vector<std::string>& feed_target_names = inference_program->GetFeedTargetNames();
    const std::vector<std::string>& fetch_target_names = inference_program->GetFetchTargetNames();
    ```

  - 5. Prepare the `feed` data
    ```cpp
    #include "paddle/fluid/framework/lod_tensor.h"
    std::vector<paddle::framework::LoDTensor*> cpu_feeds;
    ...
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    for (size_t i = 0; i < feed_target_names.size(); ++i) {
      // Please make sure that cpu_feeds[i] is right for feed_target_names[i]
      feed_targets[feed_target_names[i]] = cpu_feeds[i];
    }
    ```

  - 6. Define `Tensor`s to `fetch` the results
    ```cpp
    std::vector<paddle::framework::LoDTensor*> cpu_fetchs;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    for (size_t i = 0; i < fetch_target_names.size(); ++i) {
      fetch_targets[fetch_target_names[i]] = cpu_fetchs[i];
    }
    ```

  - 7. Run the `inference_program`
    ```cpp
    executor.Run(*inference_program, scope, feed_targets, fetch_targets);
    ```

  - 8. Use the fetched data
    ```cpp
    for (size_t i = 0; i < cpu_fetchs.size(); ++i) {
      std::cout << "lod_i: " << cpu_fetchs[i]->lod();
      std::cout << "dims_i: " << cpu_fetchs[i]->dims();
      std::cout << "result:";
      float* output_ptr = cpu_fetchs[i]->data<float>();
      for (int j = 0; j < cpu_fetchs[i]->numel(); ++j) {
        std::cout << " " << output_ptr[j];
      }
      std::cout << std::endl;
    }
    ```
    Steps 4-8 can be executed multiple times for different input data.

  - 9. Release the memory
    ```cpp
    delete scope;
    ```
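
  Putting steps 1-9 together, a minimal end-to-end sketch might look as follows; the model directory and the 1x1x28x28 input shape are assumptions, and error handling is omitted:

  ```cpp
  #include <iostream>
  #include <map>
  #include <string>
  #include <vector>

  #include "paddle/fluid/framework/init.h"
  #include "paddle/fluid/framework/lod_tensor.h"
  #include "paddle/fluid/inference/io.h"

  int main() {
    // Steps 1-2: initialize the device, define place/executor/scope.
    paddle::framework::InitDevices(false);
    auto place = paddle::platform::CPUPlace();
    auto executor = paddle::framework::Executor(place);
    auto* scope = new paddle::framework::Scope();

    // Step 3: load the model (separate parameter files variant).
    std::string dirname = "recognize_digits_conv.inference.model";
    auto inference_program = paddle::inference::Load(executor, *scope, dirname);

    // Step 4: get the feed/fetch target names recorded in the program.
    const auto& feed_target_names = inference_program->GetFeedTargetNames();
    const auto& fetch_target_names = inference_program->GetFetchTargetNames();

    // Step 5: prepare one input tensor (shape and contents are placeholders).
    paddle::framework::LoDTensor input;
    input.Resize(paddle::framework::make_ddim({1, 1, 28, 28}));
    float* input_ptr = input.mutable_data<float>(place);
    for (int i = 0; i < input.numel(); ++i) input_ptr[i] = 0.0f;
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    feed_targets[feed_target_names[0]] = &input;

    // Step 6: define a tensor to fetch the result into.
    paddle::framework::LoDTensor output;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    fetch_targets[fetch_target_names[0]] = &output;

    // Step 7: run the program.
    executor.Run(*inference_program, scope, feed_targets, fetch_targets);

    // Step 8: use the fetched data.
    float* output_ptr = output.data<float>();
    for (int j = 0; j < output.numel(); ++j) {
      std::cout << " " << output_ptr[j];
    }
    std::cout << std::endl;

    // Step 9: release the memory.
    delete scope;
    return 0;
  }
  ```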


- API notes

  ```cpp
  void Run(const ProgramDesc& program, Scope* scope,
           std::map<std::string, const LoDTensor*>& feed_targets,
           std::map<std::string, LoDTensor*>& fetch_targets,
           bool create_vars = true,
           const std::string& feed_holder_name = "feed",
           const std::string& fetch_holder_name = "fetch");
  ```
  - The `program` saved by the Python API `save_inference_model` contains `feed_op`s and `fetch_op`s. The `feed_targets` and `fetch_targets` provided by the user must be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`.
  - The `feed_holder_name` and `fetch_holder_name` provided by the user must also be consistent with the `feed_op`s and `fetch_op`s of the `inference_program`; they can be reset on the `inference_program` via the `SetFeedHolderName` and `SetFetchHolderName` interfaces.
  - By default, apart from `Variable`s whose `persistable` attribute is `True`, each call to `executor.Run` creates a local `Scope`, and all `Variable`s are created and destroyed within this local `Scope`, so as to minimize the memory footprint when idle.
  - `Variable`s whose `persistable` attribute is `True` include:
    - The operator parameters, e.g. `w` and `b`
    - The input variables of `feed_op`
    - The output variables of `fetch_op`


- **Do not create and destroy variables on every run ([PR](https://github.com/PaddlePaddle/Paddle/pull/9301))**
  - Run the `inference_program`
    ```cpp
    // Call once
    executor.CreateVariables(*inference_program, scope, 0);
    // Call as many times as you like
    executor.Run(
        *inference_program, scope, feed_targets, fetch_targets, false);
    ```
  - **Pros**
    - Saves the time spent repeatedly creating and destroying variables (roughly 1%-12% of the total time of each `Run`)
    - The computation results of all operators are available after execution
  - **Cons**
    - A large amount of memory stays occupied even when idle
    - Within the same `Scope`, identical variable names share the same block of memory, which can easily cause unexpected errors


- **Do not create ops on every run ([PR](https://github.com/PaddlePaddle/Paddle/pull/9630))**
  - Run the `inference_program`
    ```cpp
    // Call once
    auto ctx = executor.Prepare(*inference_program, 0);
    // Call as many times as you like if you have no need to change the inference_program
    executor.RunPreparedContext(ctx.get(), scope, feed_targets, fetch_targets);
    ```
  - **Pros**
    - Saves the time spent repeatedly creating and destroying ops
  - **Cons**
    - Once the `inference_program` is modified, `ctx` must be recreated


- **Sharing parameters across multiple threads ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_multi_thread_helper.h))**
  - Main thread
    - 1. Initialize the device
    - 2. Define the `place`, `executor`, and `scope`
    - 3. Load the model to obtain the `inference_program`
  - Worker threads (see the sketch after this list)
    - **Copy the `inference_program` into a `copy_program`, and modify the `feed_holder_name` and `fetch_holder_name` of `copy_program`**
      ```cpp
      auto copy_program = std::unique_ptr<paddle::framework::ProgramDesc>(
                 new paddle::framework::ProgramDesc(*inference_program));
      std::string feed_holder_name = "feed_" + paddle::string::to_string(thread_id);
      std::string fetch_holder_name = "fetch_" + paddle::string::to_string(thread_id);
      copy_program->SetFeedHolderName(feed_holder_name);
      copy_program->SetFetchHolderName(fetch_holder_name);
      ```
    - 4. Get the `feed_target_names` and `fetch_target_names` of `copy_program`
    - 5. Prepare the feed data, and define `Tensor`s to fetch the results
    - 6. Run the `copy_program`
      ```cpp
      executor->Run(*copy_program, scope, feed_targets, fetch_targets, true, feed_holder_name, fetch_holder_name);
      ```
    - 7. Use the fetched data
  - Main thread
    - 8. Release the resources
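
  A sketch of the worker-thread routine and its launch, loosely following the linked test helper; `std::to_string` is used here in place of `paddle::string::to_string`, and the per-thread feed/fetch preparation is elided:

  ```cpp
  #include <map>
  #include <memory>
  #include <string>
  #include <thread>
  #include <vector>

  #include "paddle/fluid/framework/lod_tensor.h"
  #include "paddle/fluid/inference/io.h"

  // Each thread copies the program and sets thread-specific feed/fetch
  // holder names, while sharing the executor, scope, and hence parameters.
  void ThreadedRunInference(
      const std::unique_ptr<paddle::framework::ProgramDesc>& inference_program,
      paddle::framework::Executor* executor,
      paddle::framework::Scope* scope,
      int thread_id) {
    auto copy_program = std::unique_ptr<paddle::framework::ProgramDesc>(
        new paddle::framework::ProgramDesc(*inference_program));
    std::string feed_holder_name = "feed_" + std::to_string(thread_id);
    std::string fetch_holder_name = "fetch_" + std::to_string(thread_id);
    copy_program->SetFeedHolderName(feed_holder_name);
    copy_program->SetFetchHolderName(fetch_holder_name);

    // Steps 4-5: prepare per-thread feed data and fetch tensors (elided).
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    // ... fill feed_targets and fetch_targets for this thread ...

    // Step 6: run the thread-local copy of the program.
    executor->Run(*copy_program, scope, feed_targets, fetch_targets, true,
                  feed_holder_name, fetch_holder_name);
  }

  // Main thread, after steps 1-3 (InitDevices, executor/scope, Load):
  void RunMultiThreadInference(
      const std::unique_ptr<paddle::framework::ProgramDesc>& inference_program,
      paddle::framework::Executor* executor,
      paddle::framework::Scope* scope,
      int num_threads) {
    std::vector<std::thread> threads;
    for (int tid = 0; tid < num_threads; ++tid) {
      threads.emplace_back(ThreadedRunInference, std::ref(inference_program),
                           executor, scope, tid);
    }
    for (auto& t : threads) t.join();  // then step 8: release resources
  }
  ```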


- Basic concepts
  - Data-related:
    - [Tensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/tensor.md), an N-dimensional array whose elements can be of any type (int, float, double, etc.)
    - [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/lod_tensor.md), a Tensor that carries LoD (Level-of-Detail), i.e. sequence, information
    - [Scope](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md), which keeps track of `Variable`s
  - Execution-related:
    - [Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/executor.md), a stateless executor that only depends on the device
    - Place
      - CPUPlace, a CPU device
      - CUDAPlace, a CUDA GPU device
  - Neural network representation:
    - [Program](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/program.md)

    For a detailed introduction, see the [**Developer's Guide to Paddle Fluid**](https://github.com/lcy-seso/learning_notes/blob/master/Fluid/developer's_guid_for_Fluid/Developer's_Guide_to_Paddle_Fluid.md).



## Inference Examples

  1. fit a line: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_fit_a_line.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_fit_a_line.cc)
  1. image classification: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_image_classification.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_image_classification.cc)
  1. label semantic roles: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_label_semantic_roles.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_label_semantic_roles.cc)
  1. recognize digits: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recognize_digits.cc)
  1. recommender system: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recommender_system.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recommender_system.cc)
  1. understand sentiment: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_understand_sentiment.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_understand_sentiment.cc)
  1. word2vec: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_word2vec.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_word2vec.cc)


## Inference Computation Optimization
- Use the Python inference optimization tool ([inference_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/inference_transpiler.py))
  ```python
  class InferenceTranspiler:
    def transpile(self, program, place, scope=None):
        ...
        if scope is None:
            scope = global_scope()
        ...
  ```
  - Applying `InferenceTranspiler` modifies the `program` in place.
  - Applying `InferenceTranspiler` modifies the values of parameters, so make sure the parameters of the `program` are in the `scope`.
- Supported optimizations
  - Fusing the computation of the batch_norm op
- Usage example ([link](https://github.com/Xreki/Xreki.github.io/blob/master/fluid/inference/inference_transpiler.py))
  ```python
  import paddle.fluid as fluid
  # NOTE: Applying the inference transpiler will change the inference_program.
  t = fluid.InferenceTranspiler()
  t.transpile(inference_program, place, inference_scope)
  ```




## Memory Usage Optimization
- Use the Python memory optimization tool ([memory_optimization_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/memory_optimization_transpiler.py))
  ```python
  fluid.memory_optimize(inference_program)
  ```
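
  A sketch of where the call fits in an inference script; the model directory is illustrative, and the program is rewritten in place before execution:

  ```python
  import paddle.fluid as fluid

  place = fluid.CPUPlace()
  exe = fluid.Executor(place)
  [inference_program, feed_target_names, fetch_targets] = \
      fluid.io.load_inference_model('recognize_digits_conv.inference.model', exe)

  # Rewrite the program so that variables reuse memory where possible.
  fluid.memory_optimize(inference_program)
  # ... then run inference_program with exe.run(...) as usual ...
  ```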