diff --git a/doc/Serving_Configure_CN.md b/doc/Serving_Configure_CN.md
index 9564dcbd51f7e280cd19c13f71885c5b9fcc2064..4288970afdc8df87558ad6b8a01f630b94df63c8 100644
--- a/doc/Serving_Configure_CN.md
+++ b/doc/Serving_Configure_CN.md
@@ -100,6 +100,7 @@ workdir_9393
 | `use_calib`                                    | bool | False   | Use TRT int8 calibration                              |
 | `gpu_multi_stream`                             | bool | False   | EnableGpuMultiStream to get larger QPS                |
 | `use_ascend_cl`                                | bool | False   | Enable for ascend910; Use with use_lite for ascend310 |
+| `request_cache_size`                           | int  | 0       | Size of the request cache in bytes; the default of 0 leaves the cache disabled |
 
 #### 当您的某个模型想使用多张GPU卡部署时.
 ```BASH
diff --git a/doc/Serving_Configure_EN.md b/doc/Serving_Configure_EN.md
index 04c4ad18fb54192bad587feff04635f4e7a1e6d7..68c52cffe690d8a97512adef0a4c073ffa23824b 100644
--- a/doc/Serving_Configure_EN.md
+++ b/doc/Serving_Configure_EN.md
@@ -100,6 +100,7 @@ More flags:
 | `use_calib`                                    | bool | False   | Use TRT int8 calibration                              |
 | `gpu_multi_stream`                             | bool | False   | EnableGpuMultiStream to get larger QPS                |
 | `use_ascend_cl`                                | bool | False   | Enable for ascend910; Use with use_lite for ascend310 |
+| `request_cache_size`                           | int  | 0       | Size of the request cache in bytes; the default of 0 leaves the cache disabled |
 
 #### Serving model with multiple gpus.
 ```BASH