Unverified commit cebf5092 authored by littletomatodonkey, committed by GitHub

add trt infer doc (#7870)

* add trt infer

* fix ch
Parent 0e52a280
@@ -11,6 +11,7 @@
- [2.3 Multilingual Model Inference](#23-多语言模型的推理)
- [3. Angle Classification Model Inference](#3-方向分类模型推理)
- [4. Text Detection, Angle Classification and Text Recognition Concatenated Inference](#4-文本检测方向分类和文字识别串联推理)
- [5. TensorRT Inference](#5-tensorrt-inference)
<a name="文本检测模型推理"></a>
@@ -40,18 +41,17 @@ python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_m
If the resolution of the input image is large and you want to predict with a larger resolution, you can set det_limit_side_len to the desired value, such as 1216:
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --det_limit_type=max --det_limit_side_len=1216
```
If you want to predict on CPU, run the following command:
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_gpu=False
```
<a name="文本识别模型推理"></a>
## 2. Text Recognition Model Inference
@@ -163,3 +163,32 @@ python3 tools/infer/predict_system.py --image_dir="./xxx.pdf" --det_model_dir=".
![](../imgs_results/system_res_00018069_v3.jpg)
For more configuration and explanation of inference hyperparameters, please refer to: [Model Inference Hyperparameters Tutorial](./inference_args.md)
## 5. TensorRT Inference
Paddle Inference integrates TensorRT in the form of subgraphs. For GPU inference scenarios, TensorRT can optimize some subgraphs, including horizontal and vertical fusion of OPs, filtering of redundant OPs, and automatic selection of the optimal kernel for each OP, which speeds up inference.
Running TRT inference with Paddle Inference generally takes two steps (a minimal sketch of the underlying API follows the list):
* (1) Collect the dynamic shape information of the model on a specific dataset and store it in a file.
* (2) Load the dynamic shape information file and run TRT inference.
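Under the hood, these two steps correspond roughly to the Paddle Inference Python API calls below. This is a minimal sketch under stated assumptions, not PaddleOCR's actual implementation: the model file names, the memory pool size, and the `min_subgraph_size` value are illustrative.

```python
from paddle.inference import Config, PrecisionType, create_predictor

model = "./ch_PP-OCRv3_det_infer/inference.pdmodel"    # assumed file names
params = "./ch_PP-OCRv3_det_infer/inference.pdiparams"
shape_file = "./ch_PP-OCRv3_det_infer/det_trt_dynamic_shape.txt"

# Step (1): plain GPU run that records the min/max/opt shape of every
# intermediate tensor into shape_file. TRT is NOT enabled in this pass.
config = Config(model, params)
config.enable_use_gpu(500, 0)  # 500 MB initial memory pool on GPU 0
config.collect_shape_range_info(shape_file)
predictor = create_predictor(config)
# ... feed representative images through `predictor` here ...

# Step (2): enable the TRT subgraph engine and load the tuned shapes.
config = Config(model, params)
config.enable_use_gpu(500, 0)
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=15,                  # illustrative; tune per model
    precision_mode=PrecisionType.Float32,
    use_static=False,
    use_calib_mode=False)
# True = allow rebuilding the engine at runtime for out-of-range inputs.
config.enable_tuned_tensorrt_dynamic_shape(shape_file, True)
predictor = create_predictor(config)
```

In PaddleOCR itself, the `--use_tensorrt=True` flag of the `tools/infer/predict_*.py` scripts drives this logic internally, so the commands below are all you need.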
Taking the text detection model as an example, first run the following command to generate the dynamic shape file. A file named `det_trt_dynamic_shape.txt`, which stores the dynamic shape information, will be generated in the `ch_PP-OCRv3_det_infer` directory.
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_tensorrt=True
```
The above run is only used to collect dynamic shape information; TRT is not yet used for inference.
After it finishes, run the following command to perform TRT inference.
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_tensorrt=True
```
**Note:**
* If the dynamic shape information file already exists from the first step, there is no need to collect it again; just run the prediction directly and TRT inference will be used. If you want to regenerate the dynamic shape information file, first delete the existing file in the model directory and then run the command again (see the sketch after this list).
* The dynamic shape information file generally only needs to be generated once. For actual deployment, it is recommended to generate it on an offline validation or test set first; the file can then be loaded directly for online TRT inference.
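For example, a small hypothetical snippet to force regeneration of the shape file for the detection model used above:

```python
from pathlib import Path

# Delete the stored dynamic shape file so that the next run with
# --use_tensorrt=True re-collects shape information from scratch.
shape_file = Path("./ch_PP-OCRv3_det_infer/det_trt_dynamic_shape.txt")
if shape_file.exists():
    shape_file.unlink()
    print(f"Removed {shape_file}; the next run will regenerate it.")
```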
@@ -12,6 +12,7 @@ This article introduces the use of the Python inference engine for the PP-OCR mo
- [3. Multilingual Model Inference](#3-multilingual-model-inference)
- [Angle Classification Model Inference](#angle-classification-model-inference)
- [Text Detection Angle Classification and Recognition Inference Concatenation](#text-detection-angle-classification-and-recognition-inference-concatenation)
- [TensorRT Inference](#tensorrt-inference)
<a name="DETECTION_MODEL_INFERENCE"></a>
@@ -163,3 +164,34 @@ After executing the command, the recognition result image is as follows:
![](../imgs_results/system_res_00018069_v3.jpg)
For more configuration and explanation of inference parameters, please refer to: [Model Inference Parameters Explained Tutorial](./inference_args_en.md)
## TensorRT Inference
Paddle Inference integrates TensorRT in the form of subgraphs. For GPU inference scenarios, TensorRT can optimize some subgraphs, including horizontal and vertical fusion of OPs, filtering of redundant OPs, and automatically selecting the optimal kernel for each OP, which speeds up inference.
Running inference with TRT generally takes the following two steps (a minimal API sketch follows the list):
* (1) Collect the dynamic shape information of the model on a specific dataset and store it in a file.
* (2) Load the dynamic shape information file and run TRT inference.
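As a rough sketch of what step (2) looks like against the Paddle Inference Python API — the model file names, the FP16 precision choice, `min_subgraph_size`, and the dummy input are assumptions for illustration, not PaddleOCR's exact code:

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

config = Config("./ch_PP-OCRv3_det_infer/inference.pdmodel",
                "./ch_PP-OCRv3_det_infer/inference.pdiparams")
config.enable_use_gpu(500, 0)
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=15,               # illustrative; tune per model
    precision_mode=PrecisionType.Half,  # FP16; see note below
    use_static=False,
    use_calib_mode=False)
# Load the shape file produced in step (1); True allows rebuilding the
# engine at runtime if an input falls outside the recorded shape ranges.
config.enable_tuned_tensorrt_dynamic_shape(
    "./ch_PP-OCRv3_det_infer/det_trt_dynamic_shape.txt", True)
predictor = create_predictor(config)

# Dummy 640x640 image just to show the run loop.
x = np.random.rand(1, 3, 640, 640).astype("float32")
inp = predictor.get_input_handle(predictor.get_input_names()[0])
inp.copy_from_cpu(x)
predictor.run()
out = predictor.get_output_handle(predictor.get_output_names()[0])
print(out.copy_to_cpu().shape)
```

FP16 (`PrecisionType.Half`) usually gives a further speedup on GPUs with Tensor Cores; if results look unstable, fall back to `PrecisionType.Float32`.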
Taking the text detection model as an example, first use the following command to generate the dynamic shape file. A file named `det_trt_dynamic_shape.txt`, which stores the dynamic shape information, will be generated in the `ch_PP-OCRv3_det_infer` folder.
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_tensorrt=True
```
The above command is only used to collect dynamic shape information; TRT is not used for inference in this pass.
Then, you can use the following command to perform TRT inference.
```bash
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_tensorrt=True
```
**Note:**
* If the dynamic shape information file already exists from the first step, it does not need to be collected again. If you want to regenerate it, first delete the existing file in the model folder and then run the command again.
* In general, the dynamic shape information file only needs to be generated once. For actual deployment, it is recommended to generate the file on an offline validation or test set; it can then be loaded directly for online TRT inference.