diff --git a/ppstructure/vqa/README-en.md b/ppstructure/vqa/README-en.md
index 1a03e9f5153abac082c864dc2c1197bc30c66aec..6db0965f1f0901c3579bebefb96032c6eee9442a 100644
--- a/ppstructure/vqa/README-en.md
+++ b/ppstructure/vqa/README-en.md
@@ -18,7 +18,7 @@ The main features are as follows:
 
 **Note**: This project is based on the open source implementation of [LayoutXLM](https://arxiv.org/pdf/2104.08836.pdf) on Paddle 2.2, and at the same time, after in-depth polishing by the flying Paddle team and the Industrial and **Commercial Bank of China** in the scene of real estate certificate, jointly open source.
 
-## 1 .Performance
+## 1. Performance
 
 We evaluated the algorithm on [XFUN](https://github.com/doc-analysis/XFUND) 's Chinese data set, and the performance is as follows
 
@@ -105,14 +105,14 @@ pip3 install -e .
 ```
 
 
-- **(4)Requirements for installing VQA`**
+- **(4) Install requirements for VQA**
 
 ```bash
 cd ppstructure/vqa
 pip install -r requirements.txt
 ```
 
-## 4. Use
+## 4. Usage
 
 ### 4.1 Data and pre training model preparation
 
@@ -216,7 +216,7 @@ python3.7 infer_ser_e2e.py \
     --infer_imgs "images/input/zh_val_0.jpg"
 ```
 
-* End to end evaluation of OCR engine + SER prediction system
+* End-to-end evaluation of OCR engine + SER prediction system
 
 ```shell
 export CUDA_VISIBLE_DEVICES=0
@@ -250,7 +250,7 @@ python3 train_re.py \
 
 ```
 
-* Recovery training
+* Resume training
 
 ```shell
 export CUDA_VISIBLE_DEVICES=0
@@ -324,8 +324,8 @@ python3.7 infer_ser_re_e2e.py \
     --infer_imgs "images/input/zh_val_21.jpg"
 ```
 
-## Reference Link
+## Reference
 
 - LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, https://arxiv.org/pdf/2104.08836.pdf
 - microsoft/unilm/layoutxlm, https://github.com/microsoft/unilm/tree/master/layoutxlm
-- XFUND dataset, https://github.com/doc-analysis/XFUND
\ No newline at end of file
+- XFUND dataset, https://github.com/doc-analysis/XFUND