From 35eab22421277b9eaab0df0d8feaf528ee79f97f Mon Sep 17 00:00:00 2001
From: TeslaZhao
@@ -24,37 +24,36 @@
***
-The goal of Paddle Serving is to provide high-performance, flexible and easy-to-use industrial-grade online inference services for machine learning developers and enterprises.Paddle Serving supports multiple protocols such as RESTful, gRPC, bRPC, and provides inference solutions under a variety of hardware and multiple operating system environments, and many famous pre-trained model examples. The core features are as follows:
+Paddle Serving 依托深度学习框架 PaddlePaddle,旨在为深度学习开发者和企业提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下的推理解决方案,以及多种经典预训练模型示例。核心特性如下:
+- 集成高性能服务端推理引擎 [Paddle Inference](https://paddleinference.paddlepaddle.org.cn/product_introduction/inference_intro.html) 和端侧引擎 [Paddle Lite](https://paddlelite.paddlepaddle.org.cn/introduction/tech_highlights.html),其他机器学习平台(Caffe/TensorFlow/ONNX/PyTorch)可通过 [x2paddle](https://github.com/PaddlePaddle/X2Paddle) 工具迁移模型
+- 具有高性能 C++ Serving 和高易用 Python Pipeline 2套框架。C++ Serving 基于高性能 bRPC 网络框架打造高吞吐、低延迟的推理服务,性能领先竞品。Python Pipeline 基于 gRPC/gRPC-Gateway 网络框架和 Python 语言构建高易用、高吞吐推理服务框架。技术选型参考[技术选型](doc/Serving_Design_CN.md#21-设计选型)
+- 支持 HTTP、gRPC、bRPC 等多种[协议](doc/C++_Serving/Inference_Protocols_CN.md);提供 C++、Python、Java 语言 SDK
+- 设计并实现基于有向无环图(DAG) 的异步流水线高性能推理框架,具有多模型组合、异步调度、并发推理、动态批量、多卡多流推理、请求缓存等特性
+- 适配 x86(Intel) CPU、ARM CPU、Nvidia GPU、昆仑 XPU、华为昇腾310/910、海光 DCU、Nvidia Jetson 等多种硬件
+- 集成 Intel MKLDNN、Nvidia TensorRT 加速库,以及低精度量化推理
+- 提供一套模型安全部署解决方案,包括加密模型部署、鉴权校验、HTTPS 安全网关,并在实际项目中应用
+- 支持云端部署,提供百度智能云 Kubernetes 集群部署 Paddle Serving 案例
+- 提供丰富的经典模型部署示例,如 PaddleOCR、PaddleClas、PaddleDetection、PaddleSeg、PaddleNLP、PaddleRec 等套件,共计40+个预训练精品模型
+- 支持大规模稀疏参数索引模型分布式部署,具有多表、多分片、多副本、本地高频 cache 等特性、可单机或云端部署
+- 支持服务监控,提供基于普罗米修斯的性能数据统计及端口访问
-- Integrate high-performance server-side inference engine [Paddle Inference](https://paddleinference.paddlepaddle.org.cn/product_introduction/inference_intro.html) and mobile-side engine [Paddle Lite](https://paddlelite.paddlepaddle.org.cn/introduction/tech_highlights.html). Models of other machine learning platforms (Caffe/TensorFlow/ONNX/PyTorch) can be migrated to paddle through [x2paddle](https://github.com/PaddlePaddle/X2Paddle).
-- There are two frameworks, namely high-performance C++ Serving and high-easy-to-use Python pipeline. The C++ Serving is based on the bRPC network framework to create a high-throughput, low-latency inference service, and its performance indicators are ahead of competing products. The Python pipeline is based on the gRPC/gRPC-Gateway network framework and the Python language to build a highly easy-to-use and high-throughput inference service. How to choose which one please see [Techinical Selection](doc/Serving_Design_EN.md#21-design-selection).
-- Support multiple [protocols](doc/C++_Serving/Inference_Protocols_CN.md) such as HTTP, gRPC, bRPC, and provide C++, Python, Java language SDK.
-- Design and implement a high-performance inference service framework for asynchronous pipelines based on directed acyclic graph (DAG), with features such as multi-model combination, asynchronous scheduling, concurrent inference, dynamic batch, multi-card multi-stream inference, request cache, etc.
-- Adapt to a variety of commonly used computing hardwares, such as x86 (Intel) CPU, ARM CPU, Nvidia GPU, Kunlun XPU, HUAWEI Ascend 310/910, HYGON DCU、Nvidia Jetson etc.
-- Integrate acceleration libraries of Intel MKLDNN and Nvidia TensorRT, and low-precision and quantitative inference.
-- Provide a model security deployment solution, including encryption model deployment, and authentication mechanism, HTTPs security gateway, which is used in practice.
-- Support cloud deployment, provide a deployment case of Baidu Cloud Intelligent Cloud kubernetes cluster.
-- Provide more than 40 classic pre-model deployment examples, such as PaddleOCR, PaddleClas, PaddleDetection, PaddleSeg, PaddleNLP, PaddleRec and other suites, and more models continue to expand.
-- Supports distributed deployment of large-scale sparse parameter index models, with features such as multiple tables, multiple shards, multiple copies, local high-frequency cache, etc., and can be deployed on a single machine or clouds.
-- Support service monitoring, provide prometheus-based performance statistics and port access
+教程与案例
-Tutorial and Solutions
+- AIStudio 使用教程 : [Paddle Serving服务化部署框架](https://www.paddlepaddle.org.cn/tutorials/projectdetail/3946013)
+- AIStudio OCR 实战 : [基于Paddle Serving的OCR服务化部署实战](https://aistudio.baidu.com/aistudio/projectdetail/3630726)
+- 视频教程 : [深度学习服务化部署-以互联网应用为例](https://aistudio.baidu.com/aistudio/course/introduce/19084)
+- 边缘 AI 解决方案 : [基于Paddle Serving&百度智能边缘BIE的边缘AI解决方案](https://mp.weixin.qq.com/s/j0EVlQXaZ7qmoz9Fv96Yrw)
+- 政务问答解决方案 : [政务问答检索式 FAQ System](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/question_answering/faq_system)
+- 智能问答解决方案 : [保险智能问答](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/question_answering/faq_finance)
+- 语义索引解决方案 : [In-batch Negatives](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/neural_search/recall/in_batch_negative)
-- AIStudio tutorial(Chinese) : [Paddle Serving服务化部署框架](https://www.paddlepaddle.org.cn/tutorials/projectdetail/3946013)
-- AIStudio OCR practice(Chinese) : [基于PaddleServing的OCR服务化部署实战](https://aistudio.baidu.com/aistudio/projectdetail/3630726)
-- Video tutorial(Chinese) : [深度学习服务化部署-以互联网应用为例](https://aistudio.baidu.com/aistudio/course/introduce/19084)
-- Edge AI solution(Chinese) : [基于Paddle Serving&百度智能边缘BIE的边缘AI解决方案](https://mp.weixin.qq.com/s/j0EVlQXaZ7qmoz9Fv96Yrw)
-- GOVT Q&A Solution(Chinese) : [政务问答检索式 FAQ System](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/question_answering/faq_system)
-- Smart Q&A Solution(Chinese) : [保险智能问答](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/question_answering/faq_finance)
-- Semantic Indexing Solution(Chinese) : [In-batch Negatives](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/neural_search/recall/in_batch_negative)
+论文
-Papers
-
-- Paper : [JiZhi: A Fast and Cost-Effective Model-As-A-Service System for
+- 论文 : [JiZhi: A Fast and Cost-Effective Model-As-A-Service System for
Web-Scale Online Inference at Baidu](https://arxiv.org/pdf/2106.01674.pdf)
-- Paper : [ERNIE 3.0 TITAN: EXPLORING LARGER-SCALE KNOWLEDGE
+- 论文 : [ERNIE 3.0 TITAN: EXPLORING LARGER-SCALE KNOWLEDGE
ENHANCED PRE-TRAINING FOR LANGUAGE UNDERSTANDING
AND GENERATION](https://arxiv.org/pdf/2112.12731.pdf)
@@ -62,118 +61,115 @@ AND GENERATION](https://arxiv.org/pdf/2112.12731.pdf)
| PaddleOCR | PaddleDetection | PaddleClas | PaddleSeg | PaddleRec | Paddle NLP | Paddle Video |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: |
-| 8 | 12 | 14 | 2 | 3 | 6 | 1|
+| 8 | 12 | 14 | 2 | 3 | 7 | 1 |
-For more model examples, read [Model zoo](doc/Model_Zoo_EN.md)
+更多模型示例进入[模型库](doc/Model_Zoo_CN.md)
-
+
+### QQ
-- QQ Group(Group No.:697765514)
+- 飞桨推理部署交流群(Group No.:697765514)
-> Contribution
+> 贡献代码
-If you want to contribute code to Paddle Serving, please reference [Contribution Guidelines](doc/Contribute_EN.md)
-- Thanks to [@loveululu](https://github.com/loveululu) for providing python API of Cube.
-- Thanks to [@EtachGu](https://github.com/EtachGu) in updating run docker codes.
-- Thanks to [@BeyondYourself](https://github.com/BeyondYourself) in complementing the gRPC tutorial, updating the FAQ doc and modifying the mdkir command
-- Thanks to [@mcl-stone](https://github.com/mcl-stone) in updating faster_rcnn benchmark
-- Thanks to [@cg82616424](https://github.com/cg82616424) in updating the unet benchmark modifying resize comment error
-- Thanks to [@cuicheng01](https://github.com/cuicheng01) for providing 11 PaddleClas models
-- Thanks to [@Jiaqi Liu](https://github.com/LiuChiachi) for supporting prediction for string list input
-- Thanks to [@Bin Lu](https://github.com/Intsigstephon) for adding pp-shitu example
+如果您想为Paddle Serving贡献代码,请参考 [Contribution Guidelines(English)](doc/Contribute_EN.md)
+- 感谢 [@w5688414](https://github.com/w5688414) 提供 NLP Ernie Indexing 案例
+- 感谢 [@loveululu](https://github.com/loveululu) 提供 Cube python API
+- 感谢 [@EtachGu](https://github.com/EtachGu) 更新 docker 使用命令
+- 感谢 [@BeyondYourself](https://github.com/BeyondYourself) 提供grpc教程,更新FAQ教程,整理文件目录。
+- 感谢 [@mcl-stone](https://github.com/mcl-stone) 提供faster rcnn benchmark脚本
+- 感谢 [@cg82616424](https://github.com/cg82616424) 提供unet benchmark脚本和修改部分注释错误
+- 感谢 [@cuicheng01](https://github.com/cuicheng01) 提供PaddleClas的11个模型
+- 感谢 [@Jiaqi Liu](https://github.com/LiuChiachi) 新增list[str]类型输入的预测支持
+- 感谢 [@Bin Lu](https://github.com/Intsigstephon) 提供PP-Shitu C++模型示例
-> Feedback
+> 反馈
-For any feedback or to report a bug, please propose a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).
+如有任何反馈或是bug,请在 [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues)提交
> License
-[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)
+[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)
\ No newline at end of file
diff --git a/README_CN.md b/README_CN.md
index ac26551d..0e9d0da5 100755
--- a/README_CN.md
+++ b/README_CN.md
@@ -1,4 +1,4 @@
-(简体中文|[English](./README.md))
+([简体中文](./README_CN.md)|English)
@@ -24,36 +24,37 @@
***
-Paddle Serving 依托深度学习框架 PaddlePaddle 旨在帮助深度学习开发者和企业提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案,和多种经典预训练模型示例。核心特性如下:
+The goal of Paddle Serving is to provide high-performance, flexible and easy-to-use industrial-grade online inference services for machine learning developers and enterprises. Paddle Serving supports multiple protocols such as RESTful, gRPC and bRPC, provides inference solutions for a variety of hardware and operating system environments, and includes many well-known pre-trained model examples. The core features are as follows:
-- 集成高性能服务端推理引擎 [Paddle Inference](https://paddleinference.paddlepaddle.org.cn/product_introduction/inference_intro.html) 和端侧引擎 [Paddle Lite](https://paddlelite.paddlepaddle.org.cn/introduction/tech_highlights.html),其他机器学习平台(Caffe/TensorFlow/ONNX/PyTorch)可通过 [x2paddle](https://github.com/PaddlePaddle/X2Paddle) 工具迁移模型
-- 具有高性能 C++ Serving 和高易用 Python Pipeline 2套框架。C++ Serving 基于高性能 bRPC 网络框架打造高吞吐、低延迟的推理服务,性能领先竞品。Python Pipeline 基于 gRPC/gRPC-Gateway 网络框架和 Python 语言构建高易用、高吞吐推理服务框架。技术选型参考[技术选型](doc/Serving_Design_CN.md#21-设计选型)
-- 支持 HTTP、gRPC、bRPC 等多种[协议](doc/C++_Serving/Inference_Protocols_CN.md);提供 C++、Python、Java 语言 SDK
-- 设计并实现基于有向无环图(DAG) 的异步流水线高性能推理框架,具有多模型组合、异步调度、并发推理、动态批量、多卡多流推理、请求缓存等特性
-- 适配 x86(Intel) CPU、ARM CPU、Nvidia GPU、昆仑 XPU、华为昇腾310/910、海光 DCU、Nvidia Jetson 等多种硬件
-- 集成 Intel MKLDNN、Nvidia TensorRT 加速库,以及低精度量化推理
-- 提供一套模型安全部署解决方案,包括加密模型部署、鉴权校验、HTTPs 安全网关,并在实际项目中应用
-- 支持云端部署,提供百度云智能云 kubernetes 集群部署 Paddle Serving 案例
-- 提供丰富的经典模型部署示例,如 PaddleOCR、PaddleClas、PaddleDetection、PaddleSeg、PaddleNLP、PaddleRec 等套件,共计40+个预训练精品模型
-- 支持大规模稀疏参数索引模型分布式部署,具有多表、多分片、多副本、本地高频 cache 等特性、可单机或云端部署
-- 支持服务监控,提供基于普罗米修斯的性能数据统计及端口访问
+- Integrates the high-performance server-side inference engine [Paddle Inference](https://paddleinference.paddlepaddle.org.cn/product_introduction/inference_intro.html) and the mobile-side engine [Paddle Lite](https://paddlelite.paddlepaddle.org.cn/introduction/tech_highlights.html). Models from other machine learning platforms (Caffe/TensorFlow/ONNX/PyTorch) can be migrated to Paddle through [x2paddle](https://github.com/PaddlePaddle/X2Paddle).
+- Offers two frameworks: the high-performance C++ Serving and the easy-to-use Python Pipeline. C++ Serving is built on the high-performance bRPC network framework to deliver high-throughput, low-latency inference services, with performance ahead of competing products. Python Pipeline is built on the gRPC/gRPC-Gateway network framework and the Python language for high usability and high throughput. To choose between them, see [Technical Selection](doc/Serving_Design_EN.md#21-design-selection).
+- Supports multiple [protocols](doc/C++_Serving/Inference_Protocols_CN.md) such as HTTP, gRPC and bRPC, and provides SDKs for C++, Python and Java.
+- Designs and implements a high-performance asynchronous pipeline inference framework based on directed acyclic graphs (DAG), with features such as multi-model composition, asynchronous scheduling, concurrent inference, dynamic batching, multi-card multi-stream inference, request caching, etc.
+- Adapts to a variety of commonly used computing hardware, such as x86 (Intel) CPU, ARM CPU, Nvidia GPU, Kunlun XPU, HUAWEI Ascend 310/910, HYGON DCU, Nvidia Jetson, etc.
+- Integrates the Intel MKLDNN and Nvidia TensorRT acceleration libraries, as well as low-precision quantized inference.
+- Provides a model security deployment solution, including encrypted model deployment, an authentication mechanism and an HTTPS security gateway, all applied in real projects.
+- Supports cloud deployment, with a deployment case for Baidu Intelligent Cloud Kubernetes clusters.
+- Provides more than 40 classic pre-trained model deployment examples, covering suites such as PaddleOCR, PaddleClas, PaddleDetection, PaddleSeg, PaddleNLP and PaddleRec, with more models continuously added.
+- Supports distributed deployment of large-scale sparse parameter index models, with features such as multiple tables, multiple shards, multiple replicas and a local high-frequency cache; deployable on a single machine or in the cloud.
+- Supports service monitoring, providing Prometheus-based performance statistics and port access.
-
| PaddleOCR | PaddleDetection | PaddleClas | PaddleSeg | PaddleRec | Paddle NLP | Paddle Video |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: |
-| 8 | 12 | 14 | 2 | 3 | 7 | 1 |
+| 8 | 12 | 14 | 2 | 3 | 6 | 1 |
-更多模型示例进入[模型库](doc/Model_Zoo_CN.md)
+For more model examples, read [Model zoo](doc/Model_Zoo_EN.md)
-
+
-### QQ
-- 飞桨推理部署交流群(Group No.:697765514)
+- QQ Group(Group No.:697765514)
-> 贡献代码
+> Contribution
-如果您想为Paddle Serving贡献代码,请参考 [Contribution Guidelines(English)](doc/Contribute_EN.md)
-- 感谢 [@w5688414](https://github.com/w5688414) 提供 NLP Ernie Indexing 案例
-- 感谢 [@loveululu](https://github.com/loveululu) 提供 Cube python API
-- 感谢 [@EtachGu](https://github.com/EtachGu) 更新 docker 使用命令
-- 感谢 [@BeyondYourself](https://github.com/BeyondYourself) 提供grpc教程,更新FAQ教程,整理文件目录。
-- 感谢 [@mcl-stone](https://github.com/mcl-stone) 提供faster rcnn benchmark脚本
-- 感谢 [@cg82616424](https://github.com/cg82616424) 提供unet benchmark脚本和修改部分注释错误
-- 感谢 [@cuicheng01](https://github.com/cuicheng01) 提供PaddleClas的11个模型
-- 感谢 [@Jiaqi Liu](https://github.com/LiuChiachi) 新增list[str]类型输入的预测支持
-- 感谢 [@Bin Lu](https://github.com/Intsigstephon) 提供PP-Shitu C++模型示例
+If you want to contribute code to Paddle Serving, please refer to the [Contribution Guidelines](doc/Contribute_EN.md)
+- Thanks to [@w5688414](https://github.com/w5688414) for providing the NLP Ernie Indexing example
+- Thanks to [@loveululu](https://github.com/loveululu) for providing the Cube Python API
+- Thanks to [@EtachGu](https://github.com/EtachGu) for updating the docker run commands
+- Thanks to [@BeyondYourself](https://github.com/BeyondYourself) for complementing the gRPC tutorial, updating the FAQ doc and fixing the mkdir command
+- Thanks to [@mcl-stone](https://github.com/mcl-stone) for updating the faster_rcnn benchmark
+- Thanks to [@cg82616424](https://github.com/cg82616424) for updating the unet benchmark and fixing a resize comment error
+- Thanks to [@cuicheng01](https://github.com/cuicheng01) for providing 11 PaddleClas models
+- Thanks to [@Jiaqi Liu](https://github.com/LiuChiachi) for supporting prediction with list[str] input
+- Thanks to [@Bin Lu](https://github.com/Intsigstephon) for providing the PP-Shitu C++ example
-> 反馈
+> Feedback
-如有任何反馈或是bug,请在 [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues)提交
+For any feedback or to report a bug, please propose a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).
> License
diff --git a/doc/FAQ_CN.md b/doc/FAQ_CN.md
index 59455a77..159de7c0 100644
--- a/doc/FAQ_CN.md
+++ b/doc/FAQ_CN.md
@@ -30,7 +30,7 @@ Failed to predict: (data_id=1 log_id=0) [det|0] Failed to postprocess: postproce
#### Q: Paddle Serving 支持哪些数据类型?
-**A:** 在 protobuf 定义中 `feed_type` 和 `fetch_type` 编号与数据类型对应如下,完整信息可参考[保存用于 Serving 部署的模型参数](./5-1_Save_Model_Params_CN.md)
+**A:** 在 protobuf 定义中 `feed_type` 和 `fetch_type` 编号与数据类型对应如下,完整信息可参考[保存用于 Serving 部署的模型参数](./Save_CN.md)
| 类型 | 类型值 |
|------|------|
@@ -49,7 +49,7 @@ Failed to predict: (data_id=1 log_id=0) [det|0] Failed to postprocess: postproce
#### Q: Paddle Serving 是否支持 Windows 和 Linux 原生环境部署?
-**A:** 安装 `Linux Docker`,在 Docker 中部署 Paddle Serving,参考[安装指南](./2-0_Index_CN.md)
+**A:** 安装 `Linux Docker`,在 Docker 中部署 Paddle Serving,参考[安装指南](./Install_CN.md)
#### Q: Paddle Serving 如何修改消息大小限制
@@ -61,7 +61,7 @@ Failed to predict: (data_id=1 log_id=0) [det|0] Failed to postprocess: postproce
#### Q: Paddle Serving 支持哪些网络协议?
-**A:** C++ Serving 同时支持 HTTP、gRPC 和 bRPC 协议。其中 HTTP 协议既支持 HTTP + Json 格式,同时支持 HTTP + proto 格式。完整信息请阅读[C++ Serving 通讯协议](./6-2_Cpp_Serving_Protocols_CN.md);Python Pipeline 支持 HTTP 和 gRPC 协议,更多信息请阅读[Python Pipeline 框架设计](./6-2_Cpp_Serving_Protocols_CN.md)
+**A:** C++ Serving 同时支持 HTTP、gRPC 和 bRPC 协议。其中 HTTP 协议既支持 HTTP + Json 格式,同时支持 HTTP + proto 格式。完整信息请阅读[C++ Serving 通讯协议](./C++_Serving/Inference_Protocols_CN.md);Python Pipeline 支持 HTTP 和 gRPC 协议,更多信息请阅读[Python Pipeline 框架设计](./Python_Pipeline/Pipeline_Features_CN.md)
@@ -309,7 +309,7 @@ GLOG_v=2 python -m paddle_serving_server.serve --model xxx_conf/ --port 9999
#### Q: Python Pipeline 启动成功后,日志文件在哪里,在哪里设置日志级别?
-**A:** Python Pipeline 服务的日志信息请阅读[Python Pipeline 设计](./7-1_Python_Pipeline_Design_CN.md) 第三节服务日志。
+**A:** Python Pipeline 服务的日志信息请阅读[Python Pipeline 设计](./Python_Pipeline/Pipeline_Design_CN.md) 第三节服务日志。
#### Q: (GLOG_v=2下)Server 日志一切正常,但 Client 始终得不到正确的预测结果
diff --git a/doc/Low_Precision_CN.md b/doc/Low_Precision_CN.md
index f9de4300..387e525c 100644
--- a/doc/Low_Precision_CN.md
+++ b/doc/Low_Precision_CN.md
@@ -4,6 +4,8 @@
低精度部署, 在Intel CPU上支持int8、bfloat16模型,Nvidia TensorRT支持int8、float16模型。
+## C++ Serving 部署量化模型
+
### 通过PaddleSlim量化生成低精度模型
详细见[PaddleSlim量化](https://paddleslim.readthedocs.io/zh_CN/latest/tutorials/quant/overview.html)
@@ -41,7 +43,12 @@ fetch_map = client.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
```
-### 参考文档
+## Python Pipeline 部署量化模型
+
+请参考 [Python Pipeline 低精度推理](./Python_Pipeline/Pipeline_Features_CN.md#低精度推理)
+
+
+## 参考文档
* [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
* PaddleInference Intel CPU部署量化模型[文档](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_x86_cpu_int8.html)
* PaddleInference NV GPU部署量化模型[文档](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_trt.html)
--
GitLab