From 4ac60f9eda36ababd89f97fba3e091637eb795b6 Mon Sep 17 00:00:00 2001
From: tink2123
Date: Tue, 13 Oct 2020 14:37:29 +0800
Subject: [PATCH] add corpus floder, modified depoly doc

---
 deploy/lite/readme.md | 8 ++++----
 deploy/lite/readme_en.md | 8 ++++----
 ppocr/utils/corpus/corpus.md | 6 ++++++
 ppocr/utils/corpus/corpus_ch.md | 8 ++++++++
 4 files changed, 22 insertions(+), 8 deletions(-)
 create mode 100644 ppocr/utils/corpus/corpus.md
 create mode 100644 ppocr/utils/corpus/corpus_ch.md

diff --git a/deploy/lite/readme.md b/deploy/lite/readme.md
index c3db398d..0695ef03 100644
--- a/deploy/lite/readme.md
+++ b/deploy/lite/readme.md
@@ -221,11 +221,11 @@ demo/cxx/ocr/
 1. ppocr_keys_v1.txt是中文字典文件,如果使用的 nb 模型是英文数字或其他语言的模型,需要更换为对应语言的字典。
 PaddleOCR 在ppocr/utils/下存放了多种字典,包括:
 ```
-french_dict.txt # 法语字典
-german_dict.txt # 德语字典
+dict/french_dict.txt # 法语字典
+dict/german_dict.txt # 德语字典
 ic15_dict.txt # 英文字典
-japan_dict.txt # 日语字典
-korean_dict.txt # 韩语字典
+dict/japan_dict.txt # 日语字典
+dict/korean_dict.txt # 韩语字典
 ppocr_keys_v1.txt # 中文字典
 ```
 
diff --git a/deploy/lite/readme_en.md b/deploy/lite/readme_en.md
index ac5ce538..02491d31 100644
--- a/deploy/lite/readme_en.md
+++ b/deploy/lite/readme_en.md
@@ -185,11 +185,11 @@ demo/cxx/ocr/
 If the nb model is used for English recognition or other language recognition, dictionary file should be replaced with a dictionary of the corresponding language.
 PaddleOCR provides a variety of dictionaries under ppocr/utils/, including:
 ```
-french_dict.txt # french
-german_dict.txt # german
+dict/french_dict.txt # french
+dict/german_dict.txt # german
 ic15_dict.txt # english
-japan_dict.txt # japan
-korean_dict.txt # korean
+dict/japan_dict.txt # japan
+dict/korean_dict.txt # korean
 ppocr_keys_v1.txt # chinese
 ```
 
diff --git a/ppocr/utils/corpus/corpus.md b/ppocr/utils/corpus/corpus.md
new file mode 100644
index 00000000..defd765e
--- /dev/null
+++ b/ppocr/utils/corpus/corpus.md
@@ -0,0 +1,6 @@
+# Waiting for your contribution
+
+PaddleOCR welcomes you to provide multilingual corpus for us to synthesize more data to optimize the model.
+
+If you are interested, you can submit the corpus text to this directory and name it with {language}_corpus.txt.
+PaddleOCR thanks for your contribution
\ No newline at end of file
diff --git a/ppocr/utils/corpus/corpus_ch.md b/ppocr/utils/corpus/corpus_ch.md
new file mode 100644
index 00000000..de1db54b
--- /dev/null
+++ b/ppocr/utils/corpus/corpus_ch.md
@@ -0,0 +1,8 @@
+# 欢迎贡献语料
+
+PaddleOCR非常欢迎您提供多语言的语料,以供我们合成更多数据来优化模型。
+
+如您感兴趣,可将语料文本提交到此目录,并以 {语言}_corpus.txt 命名。PaddleOCR团队感谢你的贡献
+
+
+
--
GitLab