Unverified Commit 5e54aa9b authored by TeslaZhao, committed by GitHub

Merge pull request #1650 from TeslaZhao/v0.8.0

Merge pull request #1649 from TeslaZhao/develop
......@@ -62,7 +62,6 @@ This chapter guides you through the installation and deployment steps. It is str
- [Deploy Paddle Serving on Kubernetes(Chinese)](doc/Run_On_Kubernetes_CN.md)
- [Deploy Paddle Serving with Security gateway(Chinese)](doc/Serving_Auth_Docker_CN.md)
- Deploy on more hardware [[Baidu Kunlun](doc/Run_On_XPU_CN.md), [Huawei Ascend](doc/Run_On_NPU_CN.md), [Hygon DCU](doc/Run_On_DCU_CN.md), [Jetson](doc/Run_On_JETSON_CN.md)]
- [Docker Images(Chinese)](doc/Docker_Images_CN.md)
- [Docker Images](doc/Docker_Images_EN.md)
- [Latest Wheel packages](doc/Latest_Packages_CN.md)
......@@ -86,6 +85,7 @@ The first step is to call the model save interface to generate a model parameter
- [Multiple models in series(Chinese)](doc/C++_Serving/2+_model.md)
- [Python Pipeline](doc/Python_Pipeline/Pipeline_Design_EN.md)
- [Analyze and optimize performance](doc/Python_Pipeline/Performance_Tuning_EN.md)
- [TensorRT Dynamic Shape](doc/TensorRT_Dynamic_Shape_EN.md)
- [Benchmark(Chinese)](doc/Python_Pipeline/Benchmark_CN.md)
- Client SDK
- [Python SDK(Chinese)](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
......
......@@ -80,6 +80,7 @@ Paddle Serving, built on the deep learning framework PaddlePaddle, aims to help deep learning developers
- [Multiple models in series](doc/C++_Serving/2+_model.md)
- [Python Pipeline Design](doc/Python_Pipeline/Pipeline_Design_CN.md)
- [Performance Tuning Guide](doc/Python_Pipeline/Performance_Tuning_CN.md)
- [TensorRT Dynamic Shape](doc/TensorRT_Dynamic_Shape_CN.md)
- [Benchmark](doc/Python_Pipeline/Benchmark_CN.md)
- Client SDK
- [Python SDK](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
......
......@@ -23,25 +23,8 @@ Paddle Serving provides users with HTTP- and RPC-based services
``` shell
python3 -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
<center>
| Argument | Type | Default | Description |
| ---------------------------------------------- | ----- | ------- | ---------------------------------------------------------- |
| `thread` | int | `2` | Number of brpc service threads |
| `op_num` | int[] | `0` | Number of threads for each model in asynchronous mode |
| `op_max_batch` | int[] | `32` | Maximum batch size for each model in asynchronous mode |
| `gpu_ids` | str[] | `"-1"` | GPU card IDs for each model |
| `port` | int | `9292` | Port the service exposes to users |
| `model` | str[] | `""` | Path of the Paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphics memory optimization |
| `ir_optim` | bool | False | Enable analysis and optimization of the computation graph |
| `use_mkl` (Only for CPU version) | - | - | Run inference with MKL |
| `use_trt` (Only for TRT version) | - | - | Run inference with TensorRT |
| `use_lite` (Only for Intel x86 CPU or ARM CPU) | - | - | Run inference with PaddleLite |
| `use_xpu` | - | - | Run PaddleLite inference on Baidu Kunlun XPU |
| `precision` | str | FP32 | Precision mode; supports FP32, FP16, and INT8 |
| `use_calib` | bool | False | Use TensorRT INT8 calibration |
| `gpu_multi_stream` | bool | False | Enable GPU multi-stream to increase QPS |
</center>

For the complete list of parameters, see the document [Serving Configuration](doc/Serving_Configure_EN.md#c-serving).
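As a quick illustration of the flags above, a minimal sketch (the flag combination is an assumption based on the table, not taken from this page) that serves the same model on two GPU cards with multi-stream enabled:

``` shell
# Sketch: serve uci_housing_model on GPU cards 0 and 1,
# with GPU multi-stream enabled for higher QPS.
python3 -m paddle_serving_server.serve \
    --model uci_housing_model \
    --port 9292 \
    --gpu_ids 0,1 \
    --gpu_multi_stream
```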
#### Notes on asynchronous mode
Asynchronous mode is suitable for (1) scenarios with a very large number of requests, and (2) chains of multiple models where the concurrency of each model needs to be set separately, as in the sketch below.
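A hedged sketch of such a launch: the flag spellings follow the table above, while the two model directory names are hypothetical placeholders for a chained deployment:

``` shell
# Hypothetical two-model chain: 4 threads / max batch 16 for the first model,
# 8 threads / max batch 32 for the second.
python3 -m paddle_serving_server.serve \
    --model det_model rec_model \
    --op_num 4 8 \
    --op_max_batch 16 32 \
    --port 9292
```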
......
......@@ -20,23 +20,8 @@ A user can also start an RPC service with `paddle_serving_server.serve`. RPC serv
``` shell
python3 -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
<center>
| Argument | Type | Default | Description |
| ---------------------------------------------- | ---- | ------- | ---------------------------------------------------------- |
| `thread` | int | `4` | Concurrency of the current service |
| `port` | int | `9292` | Port the service exposes to users |
| `model` | str | `""` | Path of the Paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphics memory optimization |
| `ir_optim` | bool | False | Enable analysis and optimization of the computation graph |
| `use_mkl` (Only for CPU version) | - | - | Run inference with MKL |
| `use_trt` (Only for TRT version) | - | - | Run inference with TensorRT |
| `use_lite` (Only for Intel x86 CPU or ARM CPU) | - | - | Run inference with PaddleLite |
| `use_xpu` | - | - | Run PaddleLite inference on Baidu Kunlun XPU |
| `precision` | str | FP32 | Precision mode; supports FP32, FP16, and INT8 |
| `use_calib` | bool | False | Use TensorRT INT8 calibration (only for deployment with TensorRT) |
</center>
For a complete list of parameters, see the document [Serving Configuration](doc/Serving_Configure_CN.md#c-serving).
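A sketch combining the GPU-related flags from the table above (it assumes a TensorRT-enabled build of the server; the flag combination is illustrative, not taken from this page):

``` shell
# Sketch: GPU inference with TensorRT at FP16 precision on card 0.
python3 -m paddle_serving_server.serve \
    --model uci_housing_model \
    --port 9292 \
    --gpu_ids 0 \
    --use_trt \
    --precision FP16
```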
```python
# A user can access the RPC service through the paddle_serving_client API
......
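# --- The block above is truncated by the diff view. What follows is a hedged
# sketch of the standard uci_housing client flow, not the original file's
# code; the config path and feed/fetch names are assumptions based on the
# usual save_model output of this example. ---
import numpy as np
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])

# One sample of the 13 UCI housing features, shaped [batch, 13].
data = np.array([[0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                  -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795,
                  -0.0332]], dtype=np.float32)
fetch_map = client.predict(feed={"x": data}, fetch=["price"], batch=True)
print(fetch_map["price"])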
# Serving Configuration
(Simplified Chinese | [English](./Serving_Configure_EN.md))
......