diff --git a/doc/Serving_Configure_CN.md b/doc/Serving_Configure_CN.md
index 9564dcbd51f7e280cd19c13f71885c5b9fcc2064..4288970afdc8df87558ad6b8a01f630b94df63c8 100644
--- a/doc/Serving_Configure_CN.md
+++ b/doc/Serving_Configure_CN.md
@@ -100,6 +100,7 @@ workdir_9393
 | `use_calib` | bool | False | Use TRT int8 calibration |
 | `gpu_multi_stream` | bool | False | EnableGpuMultiStream to get larger QPS |
 | `use_ascend_cl` | bool | False | Enable for ascend910; Use with use_lite for ascend310 |
+| `request_cache_size` | int | `0` | Bytes size of request cache. By default, the cache is disabled |
 
 #### 当您的某个模型想使用多张GPU卡部署时.
 ```BASH
diff --git a/doc/Serving_Configure_EN.md b/doc/Serving_Configure_EN.md
index 04c4ad18fb54192bad587feff04635f4e7a1e6d7..68c52cffe690d8a97512adef0a4c073ffa23824b 100644
--- a/doc/Serving_Configure_EN.md
+++ b/doc/Serving_Configure_EN.md
@@ -100,6 +100,7 @@ More flags:
 | `use_calib` | bool | False | Use TRT int8 calibration |
 | `gpu_multi_stream` | bool | False | EnableGpuMultiStream to get larger QPS |
 | `use_ascend_cl` | bool | False | Enable for ascend910; Use with use_lite for ascend310 |
+| `request_cache_size` | int | `0` | Bytes size of request cache. By default, the cache is disabled |
 
 #### Serving model with multiple gpus.
 ```BASH