diff --git a/README.md b/README.md
index 3d50443a5bbacc7a709f43fbb5bb1cc504cf6fee..9b1b20236f190dc5d7731c01d1b7458e520a7835 100755
--- a/README.md
+++ b/README.md
@@ -62,7 +62,6 @@ This chapter guides you through the installation and deployment steps. It is str
 - [Deploy Paddle Serving on Kubernetes(Chinese)](doc/Run_On_Kubernetes_CN.md)
 - [Deploy Paddle Serving with Security gateway(Chinese)](doc/Serving_Auth_Docker_CN.md)
 - Deploy on more hardwares[[百度昆仑](doc/Run_On_XPU_CN.md)、[华为昇腾](doc/Run_On_NPU_CN.md)、[海光DCU](doc/Run_On_DCU_CN.md)、[Jetson](doc/Run_On_JETSON_CN.md)]
-- [Docker镜像](doc/Docker_Images_CN.md)
 - [Docker Images](doc/Docker_Images_EN.md)
 - [Latest Wheel packages](doc/Latest_Packages_CN.md)
@@ -86,6 +85,7 @@ The first step is to call the model save interface to generate a model parameter
   - [Multiple models in series(Chinese)](doc/C++_Serving/2+_model.md)
   - [Python Pipeline](doc/Python_Pipeline/Pipeline_Design_EN.md)
     - [Analyze and optimize performance](doc/Python_Pipeline/Performance_Tuning_EN.md)
+    - [TensorRT Dynamic Shape](doc/TensorRT_Dynamic_Shape_EN.md)
     - [Benchmark(Chinese)](doc/Python_Pipeline/Benchmark_CN.md)
 - Client SDK
   - [Python SDK(Chinese)](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
diff --git a/README_CN.md b/README_CN.md
index 252cae31644dfc3d276dc504456f6c44df1292c8..24d2d104c0239ed3bf5c4b8d0a49f01657392d94 100755
--- a/README_CN.md
+++ b/README_CN.md
@@ -80,6 +80,7 @@ Paddle Serving依托深度学习框架PaddlePaddle旨在帮助深度学习开发
   - [多模型串联](doc/C++_Serving/2+_model.md)
   - [Python Pipeline设计](doc/Python_Pipeline/Pipeline_Design_CN.md)
     - [性能优化指南](doc/Python_Pipeline/Performance_Tuning_CN.md)
+    - [TensorRT动态shape](doc/TensorRT_Dynamic_Shape_CN.md)
     - [性能指标](doc/Python_Pipeline/Benchmark_CN.md)
 - 客户端SDK
   - [Python SDK](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
diff --git a/doc/Quick_Start_CN.md b/doc/Quick_Start_CN.md
index 0027ffd80c33bb0a2f5317e08d0846ad0a02700f..4ce11e7b3ec5f61ef46f83c936ffadf612ee764b 100644
--- a/doc/Quick_Start_CN.md
+++ b/doc/Quick_Start_CN.md
@@ -23,25 +23,8 @@ Paddle Serving 为用户提供了基于 HTTP 和 RPC 的服务
 ``` shell
 python3 -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
 ```
-
-
-| Argument | Type | Default | Description |
-| ---------------------------------------------- | ---- | ------- | ----------------------------------------------------- |
-| `thread` | int | `2` | Number of brpc service thread |
-| `op_num` | int[]| `0` | Thread Number for each model in asynchronous mode |
-| `op_max_batch` | int[]| `32` | Batch Number for each model in asynchronous mode |
-| `gpu_ids` | str[]| `"-1"` | Gpu card id for each model |
-| `port` | int | `9292` | Exposed port of current service to users |
-| `model` | str[]| `""` | Path of paddle model directory to be served |
-| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
-| `ir_optim` | bool | False | Enable analysis and optimization of calculation graph |
-| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
-| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT |
-| `use_lite` (Only for Intel x86 CPU or ARM CPU) | - | - | Run PaddleLite inference |
-| `use_xpu` | - | - | Run PaddleLite inference with Baidu Kunlun XPU |
-| `precision` | str | FP32 | Precision Mode, support FP32, FP16, INT8 |
-| `use_calib` | bool | False | Use TRT int8 calibration |
-| `gpu_multi_stream` | bool | False | EnableGpuMultiStream to get larger QPS |
+
+完整参数列表请参阅文档[Serving配置](Serving_Configure_CN.md#c-serving)。

 #### 异步模型的说明
 异步模式适用于1、请求数量非常大的情况,2、多模型串联,想要分别指定每个模型的并发数的情况。
diff --git a/doc/Quick_Start_EN.md b/doc/Quick_Start_EN.md
index 36818bb233bde3d5484a909dcc88b9fc7c73c504..9180bccb750cb50c565d795d4ff212ba76554d45 100644
--- a/doc/Quick_Start_EN.md
+++ b/doc/Quick_Start_EN.md
@@ -20,23 +20,8 @@ A user can also start a RPC service with `paddle_serving_server.serve`. RPC serv
 ``` shell
 python3 -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
 ```
-
-
-| Argument | Type | Default | Description |
-| ---------------------------------------------- | ---- | ------- | ----------------------------------------------------- |
-| `thread` | int | `4` | Concurrency of current service |
-| `port` | int | `9292` | Exposed port of current service to users |
-| `model` | str | `""` | Path of paddle model directory to be served |
-| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
-| `ir_optim` | bool | False | Enable analysis and optimization of calculation graph |
-| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
-| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT |
-| `use_lite` (Only for Intel x86 CPU or ARM CPU) | - | - | Run PaddleLite inference |
-| `use_xpu` | - | - | Run PaddleLite inference with Baidu Kunlun XPU |
-| `precision` | str | FP32 | Precision Mode, support FP32, FP16, INT8 |
-| `use_calib` | bool | False | Only for deployment with TensorRT |
-
-
+
+For a complete list of parameters, see the document [Serving Configuration](Serving_Configure_EN.md#c-serving).

 ```python
 # A user can visit rpc service through paddle_serving_client API
diff --git a/doc/Serving_Configure_CN.md b/doc/Serving_Configure_CN.md
index 5b42221a894de54c4c46e23c254f62d464c9bc4f..c3b20a8a7aa6baa8d9ab5172bb574f95b90ef45d 100644
--- a/doc/Serving_Configure_CN.md
+++ b/doc/Serving_Configure_CN.md
@@ -1,4 +1,4 @@
-# Serving Configuration
+# Serving配置

 (简体中文|[English](./Serving_Configure_EN.md))
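
Note on the async mode referenced in the Quick Start hunks above: the removed tables documented `op_num`, `op_max_batch`, and `gpu_ids` as per-model arrays (`int[]`/`str[]`). Below is a minimal launch sketch, assuming the space-separated, one-value-per-model form those array types imply; the model directories and numeric values are hypothetical placeholders, not tested commands.

``` shell
# Async-mode sketch: two chained models, each with its own worker-thread
# count, max batch size, and GPU (one value per model, matching the removed
# table's int[]/str[] types). model_A and model_B are placeholder paths.
python3 -m paddle_serving_server.serve \
    --model model_A model_B \
    --port 9292 \
    --op_num 4 8 \
    --op_max_batch 16 32 \
    --gpu_ids 0 1
```

The relocated Serving Configuration page that both Quick Start docs now link to remains the reference for the exact semantics of these flags.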