README.md 13.5 KB
Newer Older
W
WenmuZhou 已提交
1 2
English | [简体中文](README_ch.md)

qq_25193841's avatar
qq_25193841 已提交
3 4 5 6 7 8 9 10 11 12 13 14 15
<p align="center">
 <img src="./doc/PaddleOCR_log.png" align="middle" width = "600"/>
<p align="center">
<p align="left">
    <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleOCR?color=ffa"></a>
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href=""><img src="https://img.shields.io/pypi/format/PaddleOCR?color=c77"></a>
    <a href="https://pypi.org/project/PaddleOCR/"><img src="https://img.shields.io/pypi/dm/PaddleOCR?color=9cf"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf"></a>
</p>

W
WenmuZhou 已提交
16
## Introduction
qq_25193841's avatar
qq_25193841 已提交
17

L
LDOUBLEV 已提交
18
PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them into practice.
W
WenmuZhou 已提交
19

G
grasswolfs 已提交
20

W
WenmuZhou 已提交
21
**Recent updates**
M
MissPenguin 已提交
22 23 24 25
- 2022.5.9 release PaddleOCR v2.5, including:
    - [PP-OCRv3](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.
    - [PPOCRLabelv2](./PPOCRLabel): Add the annotation function for table recognition task, key information extraction task and irregular text image.
    - Interactive e-book [*"Dive into OCR"*](./doc/doc_en/ocr_book_en.md), covers the cutting-edge theory and code practice of OCR full stack technology.
M
MissPenguin 已提交
26 27 28
- 2021.12.21 release PaddleOCR v2.4, release 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR、SEED、SAR), 1 key information extraction algorithm (SDMGR, [tutorial](./ppstructure/docs/kie_en.md)) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM, [tutorial](./ppstructure/vqa)).
- 2021.9.7 release PaddleOCR v2.3, [PP-OCRv2](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv2) is proposed. The inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server in CPU device. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
- 2021.8.3 released PaddleOCR v2.2, add a new structured documents analysis toolkit, i.e., [PP-Structure](./ppstructure/README.md), support layout analysis and table recognition (One-key to export chart images to Excel files).
qq_25193841's avatar
qq_25193841 已提交
29

W
WenmuZhou 已提交
30 31
- [more](./doc/doc_en/update_en.md)

32

M
update  
MissPenguin 已提交
33
## Features
W
WenmuZhou 已提交
34

M
update  
MissPenguin 已提交
35
PaddleOCR support a variety of cutting-edge algorithms related to OCR, and developed industrial featured models/solution [PP-OCR](./doc/doc_en/ppocr_introduction_en.md) and [PP-Structure](./ppstructure/README.md) on this basis, and get through the whole process of data production, model training, compression, inference and deployment.
D
dyning 已提交
36

M
update  
MissPenguin 已提交
37
![](./doc/features_en.png)
L
LDOUBLEV 已提交
38

M
update  
MissPenguin 已提交
39
> It is recommended to start with the “quick experience” in the document tutorial
L
LDOUBLEV 已提交
40 41


W
WenmuZhou 已提交
42
## Quick Experience
D
dyning 已提交
43

M
update  
MissPenguin 已提交
44 45 46 47 48 49 50 51 52 53 54 55
- Web online experience for the ultra-lightweight OCR: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
- Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): [Sign in to the website to obtain the QR code for  installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
- One line of code quick use: [Quick Start](./doc/doc_en/quickstart_en.md)


<a name="book"></a>
## E-book: *Dive Into OCR*
- [Dive Into OCR 📚](./doc/doc_en/ocr_book_en.md)


<a name="Community"></a>
## Community
D
dyning 已提交
56

M
update  
MissPenguin 已提交
57 58 59
- **Join us**👬: Scan the QR code below with your Wechat, you can join the official technical discussion group. Looking forward to your participation.
- **Contribution**🏅️: [Contribution page](./doc/doc_en/thirdparty.md) contains various tools and applications developed by community developers using PaddleOCR, as well as the functions, optimized documents and codes contributed to PaddleOCR. It is an official honor wall for community developers and a broadcasting station to help publicize high-quality projects.
- **Regular Season**🎁: The community regular season is a point competition for OCR developers, covering four types: documents, codes, models and applications. Awards are selected and awarded on a quarterly basis. Please refer to the [link](https://github.com/PaddlePaddle/PaddleOCR/issues/4982) for more details.
T
tink2123 已提交
60

L
LDOUBLEV 已提交
61

G
grasswolfs 已提交
62
<div align="center">
M
update  
MissPenguin 已提交
63
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG"  width = "200" height = "200" />
G
grasswolfs 已提交
64
</div>
D
dyning 已提交
65

W
WenmuZhou 已提交
66 67

<a name="Supported-Chinese-model-list"></a>
68
## PP-OCR Series Model List(Update on September 8th)
W
WenmuZhou 已提交
69 70 71

| Model introduction                                           | Model name                   | Recommended scene | Detection model                                              | Direction classifier                                         | Recognition model                                            |
| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
littletomatodonkey's avatar
littletomatodonkey 已提交
72 73
| Chinese and English ultra-lightweight PP-OCRv3 model(16.2M)     | ch_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
| English ultra-lightweight PP-OCRv3 model(13.4M)     | en_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
L
reset  
LDOUBLEV 已提交
74
| Chinese and English ultra-lightweight PP-OCRv2 model(11.6M) |  ch_PP-OCRv2_xx |Mobile & Server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
qq_25193841's avatar
qq_25193841 已提交
75
| Chinese and English ultra-lightweight PP-OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar)      |
76
| Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar)  |
L
LDOUBLEV 已提交
77

W
WenmuZhou 已提交
78

M
update  
MissPenguin 已提交
79 80 81
- For more model downloads (including multiple languages), please refer to [PP-OCR series model downloads](./doc/doc_en/models_list_en.md).
- For a new language request, please refer to [Guideline for new language_requests](#language_requests).
- For structural document analysis models, please refer to [PP-Structure models](./ppstructure/docs/models_list_en.md).
W
WenmuZhou 已提交
82 83

## Tutorials
qq_25193841's avatar
qq_25193841 已提交
84
- [Environment Preparation](./doc/doc_en/environment_en.md)
M
update  
MissPenguin 已提交
85 86 87 88
- [PP-OCR 🔥](./doc/doc_en/ppocr_introduction_en.md)
    - [Quick Start](./doc/doc_en/quickstart_en.md)
    - [Model Zoo](./doc/doc_en/models_en.md)
    - [Model training](./doc/doc_en/training_en.md)
qq_25193841's avatar
qq_25193841 已提交
89 90
        - [Text Detection](./doc/doc_en/detection_en.md)
        - [Text Recognition](./doc/doc_en/recognition_en.md)
91
        - [Text Direction Classification](./doc/doc_en/angle_class_en.md)
M
update  
MissPenguin 已提交
92
    - Model Compression
qq_25193841's avatar
qq_25193841 已提交
93 94
        - [Model Quantization](./deploy/slim/quantization/README_en.md)
        - [Model Pruning](./deploy/slim/prune/README_en.md)
M
update  
MissPenguin 已提交
95
        - [Knowledge Distillation](./doc/doc_en/knowledge_distillation_en.md)
M
update  
MissPenguin 已提交
96
    - [Inference and Deployment](./deploy/README.md)
M
update  
MissPenguin 已提交
97 98
        - [Python Inference](./doc/doc_en/inference_ppocr_en.md)
        - [C++ Inference](./deploy/cpp_infer/readme.md)
qq_25193841's avatar
qq_25193841 已提交
99
        - [Serving](./deploy/pdserving/README.md)
M
update  
MissPenguin 已提交
100 101
        - [Mobile](./deploy/lite/readme.md)
        - [Paddle2ONNX](./deploy/paddle2onnx/readme.md)
qq_25193841's avatar
qq_25193841 已提交
102
        - [Benchmark](./doc/doc_en/benchmark_en.md)  
M
update  
MissPenguin 已提交
103 104 105
- [PP-Structure 🔥](./ppstructure/README.md)
    - [Quick Start](./ppstructure/docs/quickstart_en.md)
    - [Model Zoo](./ppstructure/docs/models_list_en.md)
106
    - [Model training](./doc/doc_en/training_en.md)  
M
update  
MissPenguin 已提交
107 108 109 110 111 112 113 114 115 116 117 118 119
        - [Layout Parser](./ppstructure/layout/README.md)
        - [Table Recognition](./ppstructure/table/README.md)
        - [DocVQA](./ppstructure/vqa/README.md)
        - [Key Information Extraction](./ppstructure/docs/kie_en.md)
    - [Inference and Deployment](./deploy/README.md)
        - [Python Inference](./ppstructure/docs/inference_en.md)
        - [C++ Inference]()
        - [Serving](./deploy/pdserving/README.md)
- [Academic algorithms](./doc/doc_en/algorithms_en.md)
    - [Text detection](./doc/doc_en/algorithm_overview_en.md)
    - [Text recognition](./doc/doc_en/algorithm_overview_en.md)
    - [End-to-end](./doc/doc_en/algorithm_overview_en.md)
    - [Add New Algorithms to PaddleOCR](./doc/doc_en/add_new_algorithm_en.md)
L
LDOUBLEV 已提交
120
- Data Annotation and Synthesis
G
grasswolfs 已提交
121
    - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
D
dyning 已提交
122
    - [Data Synthesis Tool: Style-Text](./StyleText/README.md)
G
grasswolfs 已提交
123 124
    - [Other Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
    - [Other Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
W
WenmuZhou 已提交
125
- Datasets
126 127 128
    - [General OCR Datasets(Chinese/English)](doc/doc_en/dataset/datasets_en.md)
    - [HandWritten_OCR_Datasets(Chinese)](doc/doc_en/dataset/handwritten_datasets_en.md)
    - [Various OCR Datasets(multilingual)](doc/doc_en/dataset/vertical_and_multilingual_datasets_en.md)
M
update  
MissPenguin 已提交
129
- [Code Structure](./doc/doc_en/tree_en.md)
W
WenmuZhou 已提交
130
- [Visualization](#Visualization)
M
update  
MissPenguin 已提交
131
- [Community](#Community)
L
LDOUBLEV 已提交
132
- [New language requests](#language_requests)
W
WenmuZhou 已提交
133 134 135
- [FAQ](./doc/doc_en/FAQ_en.md)
- [References](./doc/doc_en/reference_en.md)
- [License](#LICENSE)
D
dyning 已提交
136

T
tink2123 已提交
137

M
update  
MissPenguin 已提交
138
<a name="Visualization"></a>
W
WenmuZhou 已提交
139 140
## Visualization [more](./doc/doc_en/visualization_en.md)
- Chinese OCR model
D
dyning 已提交
141
<div align="center">
L
LDOUBLEV 已提交
142
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
L
LDOUBLEV 已提交
143 144
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
L
LDOUBLEV 已提交
145
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
D
dyning 已提交
146
</div>
T
tink2123 已提交
147

W
WenmuZhou 已提交
148
- English OCR model
D
dyning 已提交
149
<div align="center">
L
LDOUBLEV 已提交
150
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
D
dyning 已提交
151
</div>
152

W
WenmuZhou 已提交
153
- Multilingual OCR model
D
dyning 已提交
154
<div align="center">
L
LDOUBLEV 已提交
155
    <img src="./doc/imgs_results/french_0.jpg" width="800">
L
LDOUBLEV 已提交
156
    <img src="./doc/imgs_results/korean.jpg" width="800">
D
dyning 已提交
157
</div>
T
tink2123 已提交
158

D
dyning 已提交
159

L
LDOUBLEV 已提交
160
<a name="language_requests"></a>
161
## Guideline for New Language Requests
L
LDOUBLEV 已提交
162

A
andyjpaddle 已提交
163
If you want to request a new language support, a PR with 1 following files are needed:
L
LDOUBLEV 已提交
164

G
grasswolfs 已提交
165
1. In folder [ppocr/utils/dict](./ppocr/utils/dict),
L
LDOUBLEV 已提交
166 167 168 169 170 171
it is necessary to submit the dict text to this path and name it with `{language}_dict.txt` that contains a list of all characters. Please see the format example from other files in that folder.

If your language has unique elements, please tell me in advance within any way, such as useful links, wikipedia and so on.

More details, please refer to [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).

M
MissPenguin 已提交
172

W
WenmuZhou 已提交
173 174
<a name="LICENSE"></a>
## License
175
This project is released under <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>