Merge pull request #20 from kuke/infer_revert

Remove inference doc temporarily

9eb21f38 · Yibing Liu · GitHub · d6936704 · 792f56be · 9eb21f38

Showing 3 additions and 139 deletions

BERT/README.md +3 -78

BERT/inference/README.md +0 -61

--- a/BERT/README.md
+++ b/BERT/README.md
@@ -6,11 +6,11 @@

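 ### Release highlights

-1) Full support for BERT from training through deployment, including:
+1) Full support for BERT model training, including:

 - Single-machine and distributed BERT pre-training on GPU
 - Multi-GPU BERT fine-tuning
- An inference API demo for BERT, to ease production deployment across multiple hardware platforms
+

 2) Mixed-precision (FP16/FP32) training and fine-tuning, which reduces GPU memory usage and speeds up training;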

@@ -42,9 +42,7 @@
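  - [Reading comprehension: SQuAD](#阅读理解-squad)
 - [**Mixed-precision training**: speeding up training with mixed precision](#混合精度训练)
 - [**Model conversion**: converting a BERT TensorFlow model to a Paddle Fluid model](#模型转换)
- [**Model deployment**: deployment support for multiple hardware environments](#模型部署)
-  - [Producing an inference model for deployment](#保存-inference-model)
-  - [Example inference API calls](#inference-接口调用示例)
+

 ## Installation
 This project depends on Paddle Fluid 1.3; see the [installation guide](http://www.paddlepaddle.org/#quick-start) to install it.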
@@ -320,79 +318,6 @@ python convert_params.py \

 **Note**: the conversion script runs only if both TensorFlow and Paddle Fluid 1.3 are installed.

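-## Model deployment
-
-Applying a deep learning model in practice requires deploying the trained model to different machines. Deployment has to account for the hardware environment (GPU or CPU, a single machine or a distributed cluster, or an embedded device), for the software environment (for example, whether the deep learning framework is installed on every target machine), and for runtime performance. Requiring the full framework on every target machine makes deployment cumbersome, so one workable approach is to let the model run independently of the framework. Paddle Fluid takes this approach: build the [Paddle Fluid inference](http://paddlepaddle.org/documentation/docs/zh/1.2/advanced_usage/deploy/inference/build_and_install_lib_cn.html) library and write a `C++` inference program that loads the model. At prediction time, loading the saved inference network structure and model parameters is enough to run prediction on input data. Only the Paddle Fluid inference library is needed rather than the whole framework, which makes deployment more flexible.
-
-Taking the sentence and sentence-pair classification task as an example, the following describes how to deploy a model. First train the model, then save a model for deployment, and finally write a `C++` inference program that loads the model and parameters to run prediction.
-
-The earlier section [Sentence and sentence-pair classification](#语句和句对分类任务) showed how to train a model for the XNLI task and save checkpoints. Note, however, that these checkpoints contain only the model parameters and the state needed during training (see [params](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/io_cn.html#save-params) and [persistables](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/io_cn.html#save-persistables)). To generate an [inference model](http://paddlepaddle.org/documentation/docs/zh/1.2/api_cn/io_cn.html#permalink-5-save_inference_model) for prediction, follow the steps below.
-
-### Saving the inference model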
-
-```shell
-BERT_BASE_PATH="chinese_L-12_H-768_A-12"
-TASK_NAME="XNLI"
-DATA_PATH=/path/to/xnli/data/
-INIT_CKPT_PATH=/path/to/a/finetuned/checkpoint/
-
-python -u predict_classifier.py --task_name ${TASK_NAME} \
-       --use_cuda true \
-       --batch_size 64 \
-       --data_dir ${DATA_PATH} \
-       --vocab_path ${BERT_BASE_PATH}/vocab.txt \
-       --init_checkpoint ${INIT_CKPT_PATH} \
-       --do_lower_case true \
-       --max_seq_len 128 \
-       --bert_config_path ${BERT_BASE_PATH}/bert_config.json \
-       --do_predict true \
-       --save_inference_model_path ${INIT_CKPT_PATH}
-```
-
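-The script above does two things:
-
-1. It loads the model parameters from an `init_checkpoint`; if `--do_predict` is set to `true`, it then runs on the `test` dataset and outputs the predictions.
-2. It generates the inference model corresponding to `init_checkpoint`, which is saved in the `${INIT_CKPT_PATH}/{CKPT_NAME}_inference_model` directory.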
-
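-### Example inference API calls
-
-Prediction in `C++` requires the Paddle Fluid inference library; for a concrete usage example, see `README.md` in the [`inference`](./inference) directory.
-
-The code below shows how to run prediction in `C++`. For more detail, see the example in the [`inference`](./inference) directory, which can serve as a starting point for your own inference program.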
-
-``` cpp
-#include <iostream>
-#include <vector>
-
-#include <paddle_inference_api.h>
-
-// create and set configuration
-paddle::NativeConfig config;
-config.model_dir = "xxx";
-config.use_gpu = false;
-
-// create predictor
-auto predictor = paddle::CreatePaddlePredictor(config);
-
-// create input tensors
-paddle::PaddleTensor src_id;
-src_id.dtype = paddle::PaddleDType::INT64;
-src_id.shape = ...;
-src_id.data.Reset(...);
-
-// the remaining inputs are filled in the same way as src_id
-paddle::PaddleTensor pos_id;
-paddle::PaddleTensor segment_id;
-paddle::PaddleTensor self_attention_bias;
-paddle::PaddleTensor next_segment_index;
-
-// create output tensors and run prediction
-std::vector<paddle::PaddleTensor> output;
-predictor->Run({src_id, pos_id, segment_id, self_attention_bias, next_segment_index}, &output);
-
-// the first output holds three float scores per example; the row index is printed as example_id
-const float *scores = static_cast<const float *>(output.front().data.data());
-std::cout << "example_id\tcontradiction\tentailment\tneutral" << std::endl;
-for (size_t i = 0; i < output.front().data.length() / sizeof(float); i += 3) {
-  std::cout << i / 3 << "\t" << scores[i] << "\t" << scores[i + 1] << "\t"
-            << scores[i + 2] << std::endl;
-}
-```
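
The loop in the removed snippet treats the first output tensor as a flat buffer of 32-bit floats holding three class scores per example. As a rough illustration of that layout only, the same interpretation can be sketched in Python with the standard library. The helper name `decode_scores` is hypothetical and not part of the demo, the raw bytes stand in for the predictor's output buffer, and native byte order is assumed.

```python
import struct

# Class order taken from the header printed by the C++ snippet.
LABELS = ("contradiction", "entailment", "neutral")


def decode_scores(raw_bytes, num_classes=len(LABELS)):
    """Split a flat float32 buffer into one tuple of scores per example.

    Hypothetical helper for illustration; assumes native byte order.
    """
    float_size = struct.calcsize("f")
    count = len(raw_bytes) // float_size
    if count % num_classes != 0:
        raise ValueError("buffer does not hold a whole number of examples")
    values = struct.unpack("%df" % count, raw_bytes[:count * float_size])
    # Every num_classes consecutive values belong to one example.
    return [values[i:i + num_classes] for i in range(0, count, num_classes)]
```

Each returned tuple lines up with `LABELS`, so `zip(LABELS, row)` pairs a score with its class name.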

 ## Contributors


--- a/BERT/inference/README.md
+++ b/BERT/inference/README.md
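-# BERT model inference demo
-
-## Data preprocessing
-In a real application, after the model is deployed the user still has to write a program that preprocesses the input and passes the resulting data to the model for prediction. For demonstration purposes, `gen_demo_data.py` does this processing here, including tokenization, batching, and numericalization, and writes the processed data out as a text file. Usage: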
-
-``` bash
-TASK_NAME="xnli"
-DATA_PATH=/path/to/xnli/data/
-BERT_BASE_PATH=/path/to/bert/pretrained/model/
-python gen_demo_data.py \
-    --task_name ${TASK_NAME} \
-    --data_path ${DATA_PATH} \
-    --vocab_path "${BERT_BASE_PATH}/vocab.txt" \
-    --batch_size 4096 \
-    --in_tokens \
-    > data.txt
-```
-
-**Generated data format**
-
-Each line of the generated data represents one `batch` and contains five fields:
-
-```text
-src_id, pos_id, segment_id, self_attention_bias, next_segment_index
-```
-
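-Fields are separated by semicolons (;). Within each field, `shape` and `data` are separated by a colon (:), and the values inside `shape` and `data` are separated by spaces. `self_attention_bias` is of type FLOAT32; all other fields are INT64.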
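
To make the format described in the removed README concrete, here is a minimal Python sketch that parses one generated line into shapes and flat value lists. The helper name `parse_batch_line` is hypothetical and not part of the demo. The sketch assumes exactly five fields in the order listed above and no trailing semicolon, neither of which the source confirms.

```python
# Field names in the order given by the README; only self_attention_bias is FLOAT32.
FIELD_NAMES = ["src_id", "pos_id", "segment_id", "self_attention_bias", "next_segment_index"]
FLOAT_FIELDS = {"self_attention_bias"}


def parse_batch_line(line):
    """Parse one line (one batch) into {field_name: (shape, data)}.

    Hypothetical helper for illustration; it is not part of the demo code.
    """
    fields = line.strip().split(";")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d" % (len(FIELD_NAMES), len(fields)))

    batch = {}
    for name, field in zip(FIELD_NAMES, fields):
        # shape and data are separated by a colon; values by spaces.
        shape_text, data_text = field.split(":")
        shape = [int(dim) for dim in shape_text.split()]
        convert = float if name in FLOAT_FIELDS else int
        data = [convert(value) for value in data_text.split()]

        # The flat data list must hold exactly prod(shape) values.
        expected = 1
        for dim in shape:
            expected *= dim
        if len(data) != expected:
            raise ValueError("%s: shape %s does not match %d values" % (name, shape, len(data)))
        batch[name] = (shape, data)
    return batch
```

Checking the value count against the shape catches truncated or mis-split lines early, before the data is handed to a predictor.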
-
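-## Build and run
-
-Building the inference demo requires a `C++` compiler that supports the `C++11` standard.
-
-First download the matching [PaddlePaddle inference library](http://paddlepaddle.org/documentation/docs/zh/1.3/advanced_usage/deploy/inference/build_and_install_lib_cn.html). Choose the build that matches your paddle version and configuration (whether avx and mkl are used, and the cuda and cudnn versions), and extract it into the `inference` directory; this produces a `fluid_inference` subdirectory.
-
-Assuming `paddle_infer_lib_path` is the absolute path of the `fluid_inference` subdirectory just extracted, set the runtime environment variables (using the `cpu_avx_mkl` build as an example):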
-
-``` bash
-LD_LIBRARY_PATH=${paddle_infer_lib_path}/paddle/lib/:$LD_LIBRARY_PATH
-LD_LIBRARY_PATH=${paddle_infer_lib_path}/third_party/install/mklml/lib:$LD_LIBRARY_PATH
-LD_LIBRARY_PATH=${paddle_infer_lib_path}/third_party/install/mkldnn/lib:$LD_LIBRARY_PATH
-export LD_LIBRARY_PATH
-```
-
-Build the demo
-
-``` bash
-mkdir build && cd build
-cmake .. -DFLUID_INFER_LIB=${paddle_infer_lib_path}
-make
-```
-
-This produces the `inference` executable in the `build` directory.
-
-Run the demo
-
-```bash
-./inference --logtostderr \
-    --model_dir $MODEL_PATH \
-    --data $DATA_PATH \
-    --repeat $REPEAT_NUM
-```