diff --git a/README.md b/README.md
index 64cce90c670a842b50b5b642107061b57b996b90..4bb69766e5d1bdb9fa845efed90f43c2645ec95c 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,25 @@
 English | [简体中文](README_ch.md)
+
+
+
+
+
+------------------------------------------------------------------------------------------
+
+
+
+
+
+
+
+
+
+
+
 ## Introduction
+
 PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.
 ## Notice
@@ -10,6 +29,10 @@ PaddleOCR supports both dynamic graph and static graph programming paradigm
 **Recent updates**
+- The PaddleOCR R&D team will present the newly released tools to developers at 20:15 on August 4th: [Live Address](https://live.bilibili.com/21689802).
+- 2021.8.3 Released PaddleOCR v2.2, which adds a new structured document analysis toolkit, [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), supporting layout analysis and table recognition (one-click export of table images to Excel files).
+- 2021.4.8 Released the end-to-end text recognition algorithm [PGNet](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf), published at AAAI 2021; see the tutorial [here](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md). Released multilingual recognition [models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md) supporting more than 80 languages; in particular, the performance of the [English recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/models_list_en.md#English) has been optimized.
+
 - 2021.1.21 Updated more than 25 multilingual recognition models ([models list](./doc/doc_en/models_list_en.md)), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic, and so on. Models for more languages will continue to be added ([Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)).
 - 2020.12.15 Updated the data synthesis tool [Style-Text](./StyleText/README.md), which makes it easy to synthesize a large number of images similar to the target scene image.
 - 2020.11.25 Added a new data annotation tool, [PPOCRLabel](./PPOCRLabel/README.md), which helps improve labeling efficiency. Moreover, the labeling results can be used directly to train the PP-OCR system.
@@ -80,13 +103,13 @@ For a new language request, please refer to [Guideline for new language_requests
 ## Tutorials
 - [Quick Start](./doc/doc_en/quickstart_en.md)
-- [PaddleOCR Overview and Installation](./doc/doc_en/paddleOCR_overview.md)
+- [PaddleOCR Overview and Installation](./doc/doc_en/paddleOCR_overview_en.md)
 - PP-OCR Industry Landing: from Training to Deployment
     - [PP-OCR Model and Configuration](./doc/doc_en/models_and_config_en.md)
         - [PP-OCR Model Download](./doc/doc_en/models_list_en.md)
         - [Yml Configuration](./doc/doc_en/config_en.md)
         - [Python Inference](./doc/doc_en/inference_en.md)
-    - [PP-OCR Training](./doc/doc_en/training.md)
+    - [PP-OCR Training](./doc/doc_en/training_en.md)
         - [Text Detection](./doc/doc_en/detection_en.md)
         - [Text Recognition](./doc/doc_en/recognition_en.md)
         - [Direction Classification](./doc/doc_en/angle_class_en.md)
diff --git a/README_ch.md b/README_ch.md
index 1f66613f76847007d5c2bec6d6521028130d55e8..8e9f8efc089889ce1e7c069e2dd960cd3a4cdd4d 100755
--- a/README_ch.md
+++ b/README_ch.md
@@ -1,22 +1,42 @@
 [English](README.md) | 简体中文
+

+
+
+
+
+------------------------------------------------------------------------------------------
+
+
+
+
+
+
+
+
+
+
+
 ## Introduction
+
 PaddleOCR aims to build a rich, leading, and practical OCR tool library that helps users train better models and put them into practice.
 ## Notice
 PaddleOCR supports both the dynamic graph and static graph programming paradigms
-- Dynamic graph version: dygraph branch (default); requires upgrading paddle to version 2.0.0 ([Quick Installation](./doc/doc_ch/installation.md))
+
+- Dynamic graph version: release/2.2 (default branch; the development branch is dygraph); requires upgrading paddle to version 2.0.0 or later ([Quick Installation](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/installation.md))
+
 - Static graph version: develop branch
 **Recent updates**
-- 2021.4.8 Released version 2.1: open-sourced the AAAI 2021 paper's [end-to-end recognition algorithm PGNet](./doc/doc_ch/pgnet.md), and the [multilingual models](./doc/doc_ch/multi_languages.md) now support 80+ languages.
-- 2021.2.1 Added 5 frequently asked questions to the [FAQ](./doc/doc_ch/FAQ.md), for a total of 162; it is updated every Monday, so stay tuned.
-- 2021.1.21 Updated the multilingual recognition models, which now support more than 27 languages, including Simplified Chinese, Traditional Chinese, English, French, German, Korean, Japanese, Italian, Spanish, Portuguese, Russian, and Arabic; for upcoming plans, see the [multilingual development plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)
-- 2020.12.15 Updated the data synthesis tool [Style-Text](./StyleText/README_ch.md), which can batch-synthesize large numbers of images similar to the target scene; validated in multiple scenarios with clear improvements.
-- 2020.11.25 Updated the semi-automatic annotation tool [PPOCRLabel](./PPOCRLabel/README_ch.md), which helps developers complete annotation tasks efficiently; its output format plugs directly into PP-OCR training.
-- 2020.9.22 Updated the PP-OCR technical article, https://arxiv.org/abs/2009.09941
-- [More](./doc/doc_ch/update.md)
-
+- The PaddleOCR R&D team will give an in-depth technical walkthrough of the latest release at 20:15 on the evening of August 4th: [live stream link](https://live.bilibili.com/21689802).
+- 2021.8.3 Officially released PaddleOCR v2.2, adding the document structure analysis toolkit [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README_ch.md), which supports layout analysis and table recognition (including Excel export).
+- 2021.6.29 Added 5 frequently asked questions to the [FAQ](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/FAQ.md), for a total of 248; it is updated every Monday, so stay tuned.
+- 2021.4.8 Released version 2.1: open-sourced the AAAI 2021 paper's [end-to-end recognition algorithm PGNet](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/pgnet.md), and the [multilingual models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/multi_languages.md) now support 80+ languages.
+- 2021.2.8 Officially released PaddleOCR v2.0 (branch release/2.0) and set it as the recommended default branch for users. For release details, see: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
+- 2021.1.26,28,29 The official PaddleOCR R&D team presents a three-day live course of in-depth technical walkthroughs at 19:30 on the evenings of January 26th, 28th, and 29th: [live stream link](https://live.bilibili.com/21689802)
+- [More](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/update.md)
 ## Features
diff --git a/doc/PaddleOCR_log.png b/doc/PaddleOCR_log.png
new file mode 100644
index 0000000000000000000000000000000000000000..a2df52f8565b71e6eea29782febb7b4212980ee0
Binary files /dev/null and b/doc/PaddleOCR_log.png differ
diff --git a/doc/doc_en/models_and_config_en.md b/doc/doc_en/models_and_config_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/doc/doc_en/paddleOCR_overview_en.md b/doc/doc_en/paddleOCR_overview_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/doc/doc_en/quickstart_en.md b/doc/doc_en/quickstart_en.md
index d6c66847ab493e0b9bca616c1f7bf3209f121f5a..4aad3f1f7bad9baba1048691698d389279345c47 100644
--- a/doc/doc_en/quickstart_en.md
+++ b/doc/doc_en/quickstart_en.md
@@ -69,7 +69,7 @@ If you do not use the provided test image, you can replace the following `--imag
-#### 2.1.1 English and Chinese Model
+#### 2.1.1 Chinese and English Model
 * Detection, direction classification and recognition: set the direction classifier parameter `--use_angle_cls true` to recognize vertical text.
diff --git a/doc/doc_en/training_en.md b/doc/doc_en/training_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391