Unverified · Commit 83f8a4d9 authored by zhoujun, committed by GitHub

update table recognition finetune doc (#8353)

* add can, stt to algorithm_overview_en.md

* update table recognition finetune
Parent f68813eb
......@@ -14,6 +14,9 @@
- [2.5. Distributed Training](#25-分布式训练)
- [2.6. Training in Other Environments](#26-其他训练环境)
- [2.7. Fine-tuning](#27-模型微调)
  - [2.7.1 Dataset](#271-数据选择)
  - [2.7.2 Model selection](#272-模型选择)
  - [2.7.3 Training hyperparameter selection](#273-训练超参选择)
- [3. Evaluation and Prediction](#3-模型评估与预测)
  - [3.1. Evaluation](#31-指标评估)
  - [3.2. Test table structure recognition results](#32-测试表格结构识别效果)
......@@ -219,7 +222,39 @@ Running on a DCU device requires setting the environment variable `export HIP_VISIBLE_DEVICES=0,1,2,3`
## 2.7. Fine-tuning
In practice, it is recommended to load the officially provided pretrained model and fine-tune it on your own dataset. For the fine-tuning method, please refer to the [model fine-tuning tutorial](./finetune.md).
### 2.7.1 Dataset
Data volume: it is recommended to prepare at least 2,000 table images for model fine-tuning.
### 2.7.2 Model selection
It is recommended to fine-tune the SLANet model (configuration file: [SLANet_ch.yml](../../configs/table/SLANet_ch.yml), pretrained model: [ch_ppstructure_mobile_v2.0_SLANet_train.tar](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar)); its accuracy and generalization are the best among the Chinese table pretrained models currently provided.
For more table recognition models, please refer to the [PP-Structure series model list](../../ppstructure/docs/models_list.md).
### 2.7.3 Training hyperparameter selection
When fine-tuning, the most important hyperparameters are the pretrained model path `pretrained_model` and the learning rate `learning_rate`. Part of the configuration file is shown below.
```yaml
Global:
  pretrained_model: ./ch_ppstructure_mobile_v2.0_SLANet_train/best_accuracy.pdparams # pretrained model path
Optimizer:
lr:
name: Cosine
    learning_rate: 0.001 # learning rate
warmup_epoch: 0
regularizer:
name: 'L2'
factor: 0
```
In the above configuration file, first set the `pretrained_model` field to the path of the `best_accuracy.pdparams` file.
The configuration files provided by PaddleOCR assume 4-GPU training (a total batch size of `4*48=192`) with no pretrained model loaded, so in your scenario the learning rate should be adjusted linearly with your total batch size. For example:
* For single-GPU training with a per-GPU batch_size of 48, the total batch_size is 48; it is recommended to set the learning rate to about `0.00025`.
* For single-GPU training where GPU memory limits the per-GPU batch_size to 32, the total batch_size is 32; it is recommended to set the learning rate to about `0.00017`.
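The linear scaling rule above can be sketched in a few lines. This is a minimal illustration only; the function name and constants mirror the reference setup described in the text and are not part of PaddleOCR:

```python
# Reference setup from the provided config: 4 GPUs x batch 48 = 192, lr = 0.001.
REF_TOTAL_BATCH_SIZE = 4 * 48
REF_LEARNING_RATE = 0.001

def scaled_learning_rate(num_gpus: int, batch_size_per_gpu: int) -> float:
    """Scale the learning rate linearly with the total batch size."""
    total_batch_size = num_gpus * batch_size_per_gpu
    return REF_LEARNING_RATE * total_batch_size / REF_TOTAL_BATCH_SIZE

print(round(scaled_learning_rate(1, 48), 5))  # 0.00025
print(round(scaled_learning_rate(1, 32), 5))  # 0.00017
```

The returned value is what you would put into `Optimizer.lr.learning_rate` in the config.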
# 3. Evaluation and Prediction
......
......@@ -14,6 +14,9 @@ This article provides a full-process guide for the PaddleOCR table recognition m
- [2.5. Distributed Training](#25-distributed-training)
- [2.6. Training on other platform(Windows/macOS/Linux DCU)](#26-training-on-other-platformwindowsmacoslinux-dcu)
- [2.7. Fine-tuning](#27-fine-tuning)
- [2.7.1 Dataset](#271-dataset)
  - [2.7.2 Model selection](#272-model-selection)
- [2.7.3 Training hyperparameter selection](#273-training-hyperparameter-selection)
- [3. Evaluation and Test](#3-evaluation-and-test)
- [3.1. Evaluation](#31-evaluation)
- [3.2. Test table structure recognition effect](#32-test-table-structure-recognition-effect)
......@@ -226,8 +229,40 @@ Running on a DCU device requires setting the environment variable `export HIP_VI
## 2.7. Fine-tuning
In the actual use process, it is recommended to load the officially provided pre-training model and fine-tune it in your own data set. For the fine-tuning method of the table recognition model, please refer to: [Model fine-tuning tutorial](./finetune.md).
### 2.7.1 Dataset
Data volume: it is recommended to prepare at least 2,000 table images for model fine-tuning.
### 2.7.2 Model selection
It is recommended to fine-tune the SLANet model (configuration file: [SLANet_ch.yml](../../configs/table/SLANet_ch.yml), pretrained model: [ch_ppstructure_mobile_v2.0_SLANet_train.tar](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar)); its accuracy and generalization are the best among the Chinese table pretrained models currently available.
For more table recognition models, please refer to [PP-Structure Series Model Library](../../ppstructure/docs/models_list.md).
### 2.7.3 Training hyperparameter selection
When fine-tuning, the most important hyperparameters are the pretrained model path `pretrained_model` and the learning rate `learning_rate`. Part of the configuration file is shown below.
```yaml
Global:
  pretrained_model: ./ch_ppstructure_mobile_v2.0_SLANet_train/best_accuracy.pdparams # pretrained model path
Optimizer:
lr:
name: Cosine
    learning_rate: 0.001 # learning rate
warmup_epoch: 0
regularizer:
name: 'L2'
factor: 0
```
In the above configuration file, first set the `pretrained_model` field to the path of the `best_accuracy.pdparams` file.
The configuration files provided by PaddleOCR assume 4-GPU training (a total batch size of `4*48=192`) with no pretrained model loaded, so in your scenario the learning rate should be adjusted linearly with your total batch size. For example:
* For single-GPU training with a per-GPU batch_size of 48, the total batch_size is 48; it is recommended to set the learning rate to about `0.00025`.
* For single-GPU training where GPU memory limits the per-GPU batch_size to 32, the total batch_size is 32; it is recommended to set the learning rate to about `0.00017`.
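The linear adjustment described above amounts to multiplying the reference learning rate by the ratio of your total batch size to the reference one. A short sketch (illustrative only; the helper name is not part of PaddleOCR):

```python
# Reference setup from the provided config: 4 GPUs x batch 48 = 192, lr = 0.001.
REF_TOTAL_BATCH_SIZE = 4 * 48
REF_LEARNING_RATE = 0.001

def scaled_learning_rate(num_gpus: int, batch_size_per_gpu: int) -> float:
    """Scale the learning rate linearly with the total batch size."""
    total_batch_size = num_gpus * batch_size_per_gpu
    return REF_LEARNING_RATE * total_batch_size / REF_TOTAL_BATCH_SIZE

print(round(scaled_learning_rate(1, 48), 5))  # 0.00025
print(round(scaled_learning_rate(1, 32), 5))  # 0.00017
```

The returned value is what you would put into `Optimizer.lr.learning_rate` in the config.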
# 3. Evaluation and Test
......