diff --git a/doc/doc_ch/dataset/ocr_datasets.md b/doc/doc_ch/dataset/ocr_datasets.md
index 828e35aa11e42af064a73de7501050dbcefe3a32..c6ff2e170f7c30a29e98ed2b1349cae2b84cf441 100644
--- a/doc/doc_ch/dataset/ocr_datasets.md
+++ b/doc/doc_ch/dataset/ocr_datasets.md
@@ -1,15 +1,14 @@
# OCR数据集
-- [OCR数据集](#ocr数据集)
- - [1. 文本检测](#1-文本检测)
- - [1.1 PaddleOCR 文字检测数据格式](#11-paddleocr-文字检测数据格式)
- - [1.2 公开数据集](#12-公开数据集)
- - [1.2.1 ICDAR 2015](#121-icdar-2015)
- - [2. 文本识别](#2-文本识别)
- - [2.1 PaddleOCR 文字识别数据格式](#21-paddleocr-文字识别数据格式)
- - [2.2 公开数据集](#22-公开数据集)
- - [2.1 ICDAR 2015](#21-icdar-2015)
- - [3. 数据存放路径](#3-数据存放路径)
+- [1. 文本检测](#1-文本检测)
+ - [1.1 PaddleOCR 文字检测数据格式](#11-paddleocr-文字检测数据格式)
+ - [1.2 公开数据集](#12-公开数据集)
+ - [1.2.1 ICDAR 2015](#121-icdar-2015)
+- [2. 文本识别](#2-文本识别)
+ - [2.1 PaddleOCR 文字识别数据格式](#21-paddleocr-文字识别数据格式)
+ - [2.2 公开数据集](#22-公开数据集)
+ - [2.1 ICDAR 2015](#21-icdar-2015)
+- [3. 数据存放路径](#3-数据存放路径)
这里整理了OCR中常用的公开数据集,持续更新中,欢迎各位小伙伴贡献数据集~
diff --git a/doc/doc_ch/dataset/table_datasets.md b/doc/doc_ch/dataset/table_datasets.md
index b40482ae316a6dcad6a2db811d31ba0ba7f00ec3..ae902b23ccf985d522386b7454c7f76a74917502 100644
--- a/doc/doc_ch/dataset/table_datasets.md
+++ b/doc/doc_ch/dataset/table_datasets.md
@@ -1,9 +1,8 @@
# 表格识别数据集
-- [表格识别数据集](#表格识别数据集)
- - [数据集汇总](#数据集汇总)
- - [1. PubTabNet数据集](#1-pubtabnet数据集)
- - [2. 好未来表格识别竞赛数据集](#2-好未来表格识别竞赛数据集)
+- [数据集汇总](#数据集汇总)
+- [1. PubTabNet数据集](#1-pubtabnet数据集)
+- [2. 好未来表格识别竞赛数据集](#2-好未来表格识别竞赛数据集)
这里整理了常用表格识别数据集,持续更新中,欢迎各位小伙伴贡献数据集~
diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md
index 27e1ef16e93b8ad57ccd33ff318bad255faedb6d..a915bc60fb9613f4c80e9bc0ae4bdfa3d630f052 100644
--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -2,7 +2,6 @@
本节以icdar2015数据集为例,介绍PaddleOCR中检测模型训练、评估、测试的使用方式。
-- [文字检测](#文字检测)
- [1. 准备数据和模型](#1-准备数据和模型)
- [1.1 准备数据集](#11-准备数据集)
- [1.2 下载预训练模型](#12-下载预训练模型)
diff --git a/doc/doc_ch/recognition.md b/doc/doc_ch/recognition.md
index 6b1a8bf5dbbe414ab06d9444f8b779bea9fe464d..4a1fe7ae956c187a7293d5ace99b70f499f15fd4 100644
--- a/doc/doc_ch/recognition.md
+++ b/doc/doc_ch/recognition.md
@@ -2,19 +2,18 @@
本文提供了PaddleOCR文本识别任务的全流程指南,包括数据准备、模型训练、调优、评估、预测,各个阶段的详细说明:
-- [文字识别](#文字识别)
- - [1. 数据准备](#1-数据准备)
- - [1.1 准备数据集](#11-准备数据集)
- - [1.2 字典](#12-字典)
- - [1.3 添加空格类别](#13-添加空格类别)
- - [2. 启动训练](#2-启动训练)
- - [2.1 数据增强](#21-数据增强)
- - [2.2 通用模型训练](#22-通用模型训练)
- - [2.3 多语言模型训练](#23-多语言模型训练)
- - [2.4 知识蒸馏训练](#24-知识蒸馏训练)
- - [3 评估](#3-评估)
- - [4 预测](#4-预测)
- - [5. 转Inference模型测试](#5-转inference模型测试)
+- [1. 数据准备](#1-数据准备)
+ - [1.1 准备数据集](#11-准备数据集)
+ - [1.2 字典](#12-字典)
+ - [1.3 添加空格类别](#13-添加空格类别)
+- [2. 启动训练](#2-启动训练)
+ - [2.1 数据增强](#21-数据增强)
+ - [2.2 通用模型训练](#22-通用模型训练)
+ - [2.3 多语言模型训练](#23-多语言模型训练)
+ - [2.4 知识蒸馏训练](#24-知识蒸馏训练)
+- [3 评估](#3-评估)
+- [4 预测](#4-预测)
+- [5. 转Inference模型测试](#5-转inference模型测试)
diff --git a/doc/doc_en/dataset/ocr_datasets_en.md b/doc/doc_en/dataset/ocr_datasets_en.md
index 140b7c76b09ac12d52e62572c141fb54a32504c6..c05fb87d5a3ed61ad38c97b84321819e8981d436 100644
--- a/doc/doc_en/dataset/ocr_datasets_en.md
+++ b/doc/doc_en/dataset/ocr_datasets_en.md
@@ -1,15 +1,14 @@
# OCR datasets
-- [OCR datasets](#ocr-datasets)
- - [1. Text detection](#1-text-detection)
- - [1.1 PaddleOCR text detection format annotation](#11-paddleocr-text-detection-format-annotation)
- - [1.2 Public dataset](#12-public-dataset)
- - [1.2.1 ICDAR 2015](#121-icdar-2015)
- - [2. Text recognition](#2-text-recognition)
- - [2.1 PaddleOCR text recognition format annotation](#21-paddleocr-text-recognition-format-annotation)
- - [2.2 Public dataset](#22-public-dataset)
- - [2.1 ICDAR2015](#21-icdar2015)
- - [3. 数据存放路径](#3-数据存放路径)
+- [1. Text detection](#1-text-detection)
+ - [1.1 PaddleOCR text detection format annotation](#11-paddleocr-text-detection-format-annotation)
+ - [1.2 Public dataset](#12-public-dataset)
+ - [1.2.1 ICDAR 2015](#121-icdar-2015)
+- [2. Text recognition](#2-text-recognition)
+ - [2.1 PaddleOCR text recognition format annotation](#21-paddleocr-text-recognition-format-annotation)
+ - [2.2 Public dataset](#22-public-dataset)
+ - [2.1 ICDAR 2015](#21-icdar-2015)
+- [3. Data storage path](#3-data-storage-path)
Here is a list of public datasets commonly used in OCR, which are being continuously updated. Welcome to contribute datasets~
@@ -129,9 +128,9 @@ Similar to the training set, the test set also needs to be provided a folder con
|ICDAR 2015| http://rrc.cvc.uab.es/?ch=4&com=downloads | [train](https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt)/ [test](https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt) |
| Multilingual datasets |[Baidu network disk](https://pan.baidu.com/s/1bS_u207Rm7YbY33wOECKDA) Extraction code: frgi [google drive](https://drive.google.com/file/d/18cSWX7wXSy4G0tbKJ0d9PuIaiwRLHpjA/view) | Included in the downloaded image zip |
-#### 2.1 ICDAR2015
+#### 2.1 ICDAR 2015
-The ICDAR2015 dataset can be downloaded from the link in the table above for quick validation. The lmdb format dataset required by en benchmark can also be downloaded from the table above.
+The ICDAR 2015 dataset can be downloaded from the link in the table above for quick validation. The lmdb format dataset required by en benchmark can also be downloaded from the table above.
Then download the PaddleOCR format annotation file from the table above.
@@ -146,7 +145,7 @@ The data format is as follows, (a) is the original picture, (b) is the Ground Tr
![](../../datasets/icdar_rec.png)
-## 3. 数据存放路径
+## 3. Data storage path
The default storage path for PaddleOCR training data is `PaddleOCR/train_data`, if you already have a dataset on your disk, just create a soft link to the dataset directory:
diff --git a/doc/doc_en/dataset/table_datasets_en.md b/doc/doc_en/dataset/table_datasets_en.md
index 60bd61dfd1b2aa30722802673f26e8b19a2a54c0..e30147909812a153f311add50f0bef5d1d1e0e32 100644
--- a/doc/doc_en/dataset/table_datasets_en.md
+++ b/doc/doc_en/dataset/table_datasets_en.md
@@ -1,9 +1,8 @@
# Table Recognition Datasets
-- [Table Recognition Datasets](#table-recognition-datasets)
- - [Dataset Summary](#dataset-summary)
- - [1. PubTabNet](#1-pubtabnet)
- - [2. TAL Table Recognition Competition Dataset](#2-tal-table-recognition-competition-dataset)
+- [Dataset Summary](#dataset-summary)
+- [1. PubTabNet](#1-pubtabnet)
+- [2. TAL Table Recognition Competition Dataset](#2-tal-table-recognition-competition-dataset)
Here are the commonly used table recognition datasets, which are being updated continuously. Welcome to contribute datasets~
diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index aa5e7f41b761d7ca1919e24b6a251b3572492974..1693211fb77d1adb6fe7906f01e6d8f7a8b42c17 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -2,20 +2,19 @@
This section uses the icdar2015 dataset as an example to introduce the training, evaluation, and testing of the detection model in PaddleOCR.
-- [Text Detection](#text-detection)
- - [1. Data and Weights Preparation](#1-data-and-weights-preparation)
- - [1.1 Data Preparation](#11-data-preparation)
- - [1.2 Download Pre-trained Model](#12-download-pre-trained-model)
- - [2. Training](#2-training)
- - [2.1 Start Training](#21-start-training)
- - [2.2 Load Trained Model and Continue Training](#22-load-trained-model-and-continue-training)
- - [2.3 Training with New Backbone](#23-training-with-new-backbone)
- - [2.4 Training with knowledge distillation](#24-training-with-knowledge-distillation)
- - [3. Evaluation and Test](#3-evaluation-and-test)
- - [3.1 Evaluation](#31-evaluation)
- - [3.2 Test](#32-test)
- - [4. Inference](#4-inference)
- - [5. FAQ](#5-faq)
+- [1. Data and Weights Preparation](#1-data-and-weights-preparation)
+ - [1.1 Data Preparation](#11-data-preparation)
+ - [1.2 Download Pre-trained Model](#12-download-pre-trained-model)
+- [2. Training](#2-training)
+ - [2.1 Start Training](#21-start-training)
+ - [2.2 Load Trained Model and Continue Training](#22-load-trained-model-and-continue-training)
+ - [2.3 Training with New Backbone](#23-training-with-new-backbone)
+ - [2.4 Training with knowledge distillation](#24-training-with-knowledge-distillation)
+- [3. Evaluation and Test](#3-evaluation-and-test)
+ - [3.1 Evaluation](#31-evaluation)
+ - [3.2 Test](#32-test)
+- [4. Inference](#4-inference)
+- [5. FAQ](#5-faq)
## 1. Data and Weights Preparation
diff --git a/doc/doc_en/recognition_en.md b/doc/doc_en/recognition_en.md
index 2610e76b82dcf7448bf82b13157e7855e67e4096..2b53d8ef8fd71950e80049628570793dcd49c424 100644
--- a/doc/doc_en/recognition_en.md
+++ b/doc/doc_en/recognition_en.md
@@ -1,18 +1,17 @@
# Text Recognition
-- [Text Recognition](#text-recognition)
- - [1. Data Preparation](#1-data-preparation)
- - [1.1 DataSet Preparation](#11-dataset-preparation)
- - [1.2 Dictionary](#12-dictionary)
- - [1.4 Add Space Category](#14-add-space-category)
- - [2.Training](#2training)
- - [2.1 Data Augmentation](#21-data-augmentation)
- - [2.2 General Training](#22-general-training)
- - [2.3 Multi-language Training](#23-multi-language-training)
- - [2.4 Training with Knowledge Distillation](#24-training-with-knowledge-distillation)
- - [3. Evalution](#3-evalution)
- - [4. Prediction](#4-prediction)
- - [5. Convert to Inference Model](#5-convert-to-inference-model)
+- [1. Data Preparation](#1-data-preparation)
+ - [1.1 DataSet Preparation](#11-dataset-preparation)
+ - [1.2 Dictionary](#12-dictionary)
+ - [1.4 Add Space Category](#14-add-space-category)
+- [2.Training](#2training)
+ - [2.1 Data Augmentation](#21-data-augmentation)
+ - [2.2 General Training](#22-general-training)
+ - [2.3 Multi-language Training](#23-multi-language-training)
+ - [2.4 Training with Knowledge Distillation](#24-training-with-knowledge-distillation)
+- [3. Evalution](#3-evalution)
+- [4. Prediction](#4-prediction)
+- [5. Convert to Inference Model](#5-convert-to-inference-model)
## 1. Data Preparation