From 5c75828322a1037e89daf18efc36f89f9df93377 Mon Sep 17 00:00:00 2001
From: xmy0916 <863299715@qq.com>
Date: Wed, 20 Jan 2021 22:22:36 +0800
Subject: [PATCH] fix doc
---
doc/doc_ch/models_list.md | 32 +++++++++++++++++++++-----------
doc/doc_en/models_list_en.md | 22 +++++++++++++++-------
2 files changed, 36 insertions(+), 18 deletions(-)
diff --git a/doc/doc_ch/models_list.md b/doc/doc_ch/models_list.md
index b5727b6f..c7b83d96 100644
--- a/doc/doc_ch/models_list.md
+++ b/doc/doc_ch/models_list.md
@@ -52,28 +52,38 @@ PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训
#### 3. 多语言识别模型(更多语言持续更新中...)
-**说明:** 新增的多语言模型的配置文件通过代码方式生成,以生成意大利语配置文件为例:
-
+**说明:** 新增的多语言模型的配置文件通过代码方式生成,您可以通过`--help`参数查看当前PaddleOCR支持生成哪些多语言的配置文件:
```bash
# 该代码需要在指定目录运行
cd PaddleOCR/configs/rec/multi_language/
-# 通过-l或者--language参数设置需要生成的语种的配置文件,该命令会将默认参数写入配置文件
-python3 generate_multi_language_configs.py -l it
+python3 generate_multi_language_configs.py --help
```
-您可以通过`--help`参数查看当前PaddleOCR支持生成哪些多语言的配置文件:
+下面以生成意大利语配置文件为例:
+##### 1. 生成意大利语配置文件测试现有模型
+如果您仅仅想用配置文件测试PaddleOCR提供的多语言模型可以通过下面命令生成默认的配置文件,使用PaddleOCR提供的小语种字典进行预测。
```bash
-python3 generate_multi_language_configs.py --help
+# 该代码需要在指定目录运行
+cd PaddleOCR/configs/rec/multi_language/
+# 通过-l或者--language参数设置需要生成的语种的配置文件,该命令会将默认参数写入配置文件
+python3 generate_multi_language_configs.py -l it
```
-如果您不想使用默认的路径或者默认参数可以根据以下命令修改:
-
+##### 2. 生成意大利语配置文件训练自己的数据
+如果您想训练自己的小语种模型,可以准备好训练集文件、验证集文件、字典文件和训练数据路径,这里假设准备的意大利语的训练集、验证集、字典和训练数据路径为:
+- 训练集:{your/path/}PaddleOCR/train_data/train_list.txt
+- 验证集:{your/path/}PaddleOCR/train_data/val_list.txt
+- 使用PaddleOCR提供的默认字典:{your/path/}PaddleOCR/ppocr/utils/dict/it_dict.txt
+- 训练数据路径:{your/path/}PaddleOCR/train_data
+
+使用以下命令生成配置文件:
```bash
# -l或者--language字段是必须的
# --train修改训练集,--val修改验证集,--data_dir修改数据集目录,-o修改对应默认参数
+# --dict命令改变字典路径,示例使用默认字典路径则该参数可不填
python3 generate_multi_language_configs.py -l it \
---train {path/to/train_list} \
---val {path/to/val_list} \
---data_dir {path/to/data_dir} \
+--train train_data/train_list.txt \
+--val train_data/val_list.txt \
+--data_dir train_data \
-o Global.use_gpu=False
```
diff --git a/doc/doc_en/models_list_en.md b/doc/doc_en/models_list_en.md
index 33b0dede..578badc1 100644
--- a/doc/doc_en/models_list_en.md
+++ b/doc/doc_en/models_list_en.md
@@ -51,8 +51,15 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
#### Multilingual Recognition Model(Updating...)
-**Note:** The configuration file of the new multi language model is generated by code. Take the Italian configuration file as an example:
+**Note:** The configuration files for the new multilingual models are generated by a script. Use the `--help` option to see which languages the current version of PaddleOCR supports.
+```bash
+python3 generate_multi_language_configs.py --help
+```
+
+The following sections use the Italian configuration file as an example:
+##### 1. Generate an Italian configuration file to test the provided model
+If you only want to test the multilingual models provided by PaddleOCR, you can generate the default configuration file with the following command and use the language dictionary provided by PaddleOCR for prediction.
```bash
# The code needs to run in the specified directory
cd PaddleOCR/configs/rec/multi_language/
@@ -60,18 +67,19 @@ cd PaddleOCR/configs/rec/multi_language/
# This command will write the default parameter to the configuration file.
python3 generate_multi_language_configs.py -l it
```
-You can use the `--help` parameter to check which multi language are supported by current PaddleOCR.
-
-```bash
-python3 generate_multi_language_configs.py --help
-```
-If you don't want to use the default path or default parameters, you can modify them according to the following command:
+##### 2. Generate an Italian configuration file to train on your own data
+If you want to train your own model, prepare a training set file, a validation set file, a dictionary file, and a training data directory. Here we assume that the Italian training set, validation set, dictionary, and training data directory are:
+- Training set: {your/path/}PaddleOCR/train_data/train_list.txt
+- Validation set: {your/path/}PaddleOCR/train_data/val_list.txt
+- Dictionary (the default one provided by PaddleOCR): {your/path/}PaddleOCR/ppocr/utils/dict/it_dict.txt
+- Training data path: {your/path/}PaddleOCR/train_data
```bash
# The -l or --language parameter is required
# --train modify train_list path
# --val modify eval_list path
# --data_dir modify data dir
# -o modify default parameters
+# --dict changes the dictionary path. This example uses the default dictionary, so the parameter can be omitted.
python3 generate_multi_language_configs.py -l it \
--train {path/to/train_list} \
--val {path/to/val_list} \
--
GitLab
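
Reviewer note: the documented commands pass overrides such as `-o Global.use_gpu=False`. As a rough illustration of how a dotted `KEY=VALUE` override can be merged into a nested configuration, here is a minimal Python sketch. It is not taken from `generate_multi_language_configs.py`; the function name `apply_overrides` and the sample config layout are hypothetical, and the real script may parse `-o` differently.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'Section.key=value' strings to a nested dict, in place.

    Hypothetical helper for illustration only; not PaddleOCR's implementation.
    """
    for item in overrides:
        dotted_key, sep, raw_value = item.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        try:
            # Interpret Python literals such as False, 0.001 or [3, 32, 320].
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Anything else (for example a file path) stays a plain string.
            value = raw_value
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections when they do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Assumed sample layout: only the Global.use_gpu key appears in the docs above.
config = {"Global": {"use_gpu": True}}
apply_overrides(config, ["Global.use_gpu=False"])
print(config["Global"]["use_gpu"])
```

Parsing values with `ast.literal_eval` keeps booleans and numbers typed, while unparsable values such as `train_data/train_list.txt` fall back to strings.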