English | [简体中文](README_ch.md)

<p align="center">
 <img src="./doc/PaddleOCR_log.png" align="middle" width = "600"/>
</p>
<p align="left">
    <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleOCR?color=ffa"></a>
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href=""><img src="https://img.shields.io/pypi/format/PaddleOCR?color=c77"></a>
    <a href="https://pypi.org/project/PaddleOCR/"><img src="https://img.shields.io/pypi/dm/PaddleOCR?color=9cf"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf"></a>
</p>

## Introduction

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and put them into practice.

<div align="center">
    <img src="./doc/imgs_results/PP-OCRv3/en/en_4.png" width="800">
</div>

<div align="center">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00006737.jpg" width="800">
</div>

## Recent updates
- **🔥2022.8.24 Release PaddleOCR [release/2.6](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.6)**
  - Release [PP-Structurev2](./ppstructure/), with functions and performance fully upgraded, adapted to Chinese scenes, and new support for [Layout Recovery](./ppstructure/recovery) and a **one-line command to convert PDF to Word**;
  - [Layout Analysis](./ppstructure/layout) optimization: model storage is reduced by 95%, speed is increased by 11 times, and the average CPU time cost is only 41 ms;
  - [Table Recognition](./ppstructure/table) optimization: 3 optimization strategies are designed, and model accuracy is improved by 6% at comparable time cost;
  - [Key Information Extraction](./ppstructure/kie) optimization: a vision-independent model structure is designed, the accuracy of semantic entity recognition is increased by 2.8%, and the accuracy of relation extraction is increased by 9.1%.
  
- **🔥2022.8 Release [OCR scene application collection](./applications/README_en.md)**
    - Release **9 vertical models**, including digital tube, LCD screen, license plate, and handwriting recognition models and a high-precision SVTR model, covering the main OCR vertical applications in the general, manufacturing, finance, and transportation industries.

- **2022.8 Add implementation of [8 cutting-edge algorithms](doc/doc_en/algorithm_overview_en.md)**
  - Text Detection: [FCENet](doc/doc_en/algorithm_det_fcenet_en.md), [DB++](doc/doc_en/algorithm_det_db_en.md)
  - Text Recognition: [ViTSTR](doc/doc_en/algorithm_rec_vitstr_en.md), [ABINet](doc/doc_en/algorithm_rec_abinet_en.md), [VisionLAN](doc/doc_en/algorithm_rec_visionlan_en.md), [SPIN](doc/doc_en/algorithm_rec_spin_en.md), [RobustScanner](doc/doc_en/algorithm_rec_robustscanner_en.md)
  - Table Recognition: [TableMaster](doc/doc_en/algorithm_table_master_en.md)
  
- **2022.5.9 Release PaddleOCR [release/2.5](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.5)**
    - Release [PP-OCRv3](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): with comparable speed, results on Chinese scenes improve by a further 5% over PP-OCRv2, results on English scenes improve by 11%, and the average recognition accuracy of the 80 multilingual models improves by more than 5%.
    - Release [PPOCRLabelv2](./PPOCRLabel): adds annotation functions for table recognition, key information extraction, and irregular text images.
    - Release the interactive e-book [*"Dive into OCR"*](./doc/doc_en/ocr_book_en.md), which covers the cutting-edge theory and code practice of full-stack OCR technology.

- [more](./doc/doc_en/update_en.md)


## Features

PaddleOCR supports a variety of cutting-edge OCR algorithms. On this basis it provides the industrial featured models/solutions [PP-OCR](./doc/doc_en/ppocr_introduction_en.md) and [PP-Structure](./ppstructure/README.md), and covers the whole process of data production, model training, compression, inference, and deployment.

<div align="center">
    <img src="https://user-images.githubusercontent.com/25809855/186171245-40abc4d7-904f-4949-ade1-250f86ed3a90.png">
</div>

> It is recommended to start with the "Quick Experience" section of the tutorial documentation.


## Quick Experience

- Web online experience for the ultra-lightweight OCR: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
- Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): [Sign in to the website to obtain the QR code for installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
- Quick use with one line of code: [Quick Start](./doc/doc_en/quickstart_en.md)
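
As a rough illustration of the kind of usage the Quick Start covers, the sketch below runs OCR on one image from Python and flattens the output. It assumes the `paddleocr` pip package is installed and that results use the nested layout of release/2.6 (one list per image, each entry `[box, (text, score)]`); other versions may differ. The helper names `run_ocr` and `extract_text` and the `min_score` threshold are illustrative and not part of PaddleOCR.

```python
def extract_text(result, min_score=0.5):
    """Flatten OCR output into (text, score) pairs.

    Assumes the release/2.6 layout: one list per image, where each
    entry looks like [box, (text, score)].
    """
    pairs = []
    for page in result:
        # A page may be None or empty when no text was found.
        for _box, (text, score) in page or []:
            if score >= min_score:
                pairs.append((text, score))
    return pairs


def run_ocr(image_path, lang="en"):
    """Run detection, direction classification, and recognition on one image."""
    # Imported lazily so extract_text works without PaddleOCR installed.
    from paddleocr import PaddleOCR  # requires: pip install paddlepaddle paddleocr

    ocr = PaddleOCR(use_angle_cls=True, lang=lang)  # models are downloaded on first use
    return ocr.ocr(image_path, cls=True)
```

With PaddleOCR installed, `extract_text(run_ocr("your_image.jpg"))` would return the recognized strings with their confidence scores; the image path here is a placeholder.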


<a name="book"></a>
## E-book: *Dive Into OCR*
- [Dive Into OCR 📚](./doc/doc_en/ocr_book_en.md)


<a name="Community"></a>
## Community
- For international developers, we regard [PaddleOCR Discussions](https://github.com/PaddlePaddle/PaddleOCR/discussions) as our international community platform. All ideas and questions can be discussed here in English.

- For Chinese developers, scan the QR code below with WeChat to join the official technical discussion group. For richer community content, please refer to [中文README](README_ch.md). We look forward to your participation.

<div align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG"  width = "150" height = "150" />
</div>

<a name="Supported-Chinese-model-list"></a>

## PP-OCR Series Model List (Updated on September 8th)

| Model introduction                                           | Model name                   | Recommended scene | Detection model                                              | Direction classifier                                         | Recognition model                                            |
| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Chinese and English ultra-lightweight PP-OCRv3 model (16.2M)     | ch_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
| English ultra-lightweight PP-OCRv3 model (13.4M)     | en_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
| Chinese and English ultra-lightweight PP-OCRv2 model (11.6M) |  ch_PP-OCRv2_xx |Mobile & Server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
| Chinese and English ultra-lightweight PP-OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar)      |
| Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar)  |
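
The inference models in the table are plain `.tar` archives, so fetching them can be scripted. The following is a minimal sketch using only the Python standard library. The three URLs are copied from the PP-OCRv3 Chinese row above; the helper names, the `./inference` destination, and the assumption that each archive unpacks into a directory named after the file are illustrative and not confirmed by this README.

```python
import tarfile
import urllib.request
from pathlib import Path

# Inference model URLs copied from the PP-OCRv3 Chinese row of the table.
PP_OCRV3_CH = {
    "det": "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar",
    "cls": "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
    "rec": "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar",
}


def archive_name(url):
    """Return the file name of the archive without its .tar suffix."""
    name = url.rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".tar") else name


def download_and_extract(url, dest="./inference"):
    """Download one archive (if not already present) and unpack it under dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    tar_path = dest / url.rsplit("/", 1)[-1]
    if not tar_path.exists():
        urllib.request.urlretrieve(url, str(tar_path))
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest)
    # Assumption: the archive unpacks into a directory named after the file.
    return dest / archive_name(url)
```

Calling `download_and_extract(PP_OCRV3_CH["det"])` would then place the detection model under `./inference/`; the same call works for the `cls` and `rec` entries.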


- For more model downloads (including multiple languages), please refer to [PP-OCR series model downloads](./doc/doc_en/models_list_en.md).
- For a new language request, please refer to [Guideline for New Language Requests](#language_requests).
- For structural document analysis models, please refer to [PP-Structure models](./ppstructure/docs/models_list_en.md).

## Tutorials
- [Environment Preparation](./doc/doc_en/environment_en.md)
- [PP-OCR 🔥](./doc/doc_en/ppocr_introduction_en.md)
    - [Quick Start](./doc/doc_en/quickstart_en.md)
    - [Model Zoo](./doc/doc_en/models_en.md)
    - [Model training](./doc/doc_en/training_en.md)
        - [Text Detection](./doc/doc_en/detection_en.md)
        - [Text Recognition](./doc/doc_en/recognition_en.md)
        - [Text Direction Classification](./doc/doc_en/angle_class_en.md)
    - Model Compression
        - [Model Quantization](./deploy/slim/quantization/README_en.md)
        - [Model Pruning](./deploy/slim/prune/README_en.md)
        - [Knowledge Distillation](./doc/doc_en/knowledge_distillation_en.md)
    - [Inference and Deployment](./deploy/README.md)
        - [Python Inference](./doc/doc_en/inference_ppocr_en.md)
        - [C++ Inference](./deploy/cpp_infer/readme.md)
        - [Serving](./deploy/pdserving/README.md)
        - [Mobile](./deploy/lite/readme.md)
        - [Paddle2ONNX](./deploy/paddle2onnx/readme.md)
        - [PaddleCloud](./deploy/paddlecloud/README.md)
        - [Benchmark](./doc/doc_en/benchmark_en.md)  
- [PP-Structure 🔥](./ppstructure/README.md)
    - [Quick Start](./ppstructure/docs/quickstart_en.md)
    - [Model Zoo](./ppstructure/docs/models_list_en.md)
    - [Model training](./doc/doc_en/training_en.md)  
        - [Layout Analysis](./ppstructure/layout/README.md)
        - [Table Recognition](./ppstructure/table/README.md)
        - [Key Information Extraction](./ppstructure/kie/README.md)
    - [Inference and Deployment](./deploy/README.md)
        - [Python Inference](./ppstructure/docs/inference_en.md)
        - [C++ Inference](./deploy/cpp_infer/readme.md)
        - [Serving](./deploy/hubserving/readme_en.md)
- [Academic Algorithms](./doc/doc_en/algorithm_overview_en.md)
    - [Text detection](./doc/doc_en/algorithm_overview_en.md)
    - [Text recognition](./doc/doc_en/algorithm_overview_en.md)
    - [End-to-end OCR](./doc/doc_en/algorithm_overview_en.md)
    - [Table Recognition](./doc/doc_en/algorithm_overview_en.md)
    - [Key Information Extraction](./doc/doc_en/algorithm_overview_en.md)    
    - [Add New Algorithms to PaddleOCR](./doc/doc_en/add_new_algorithm_en.md)
- Data Annotation and Synthesis
    - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
    - [Data Synthesis Tool: Style-Text](./StyleText/README.md)
    - [Other Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
    - [Other Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
- Datasets
    - [General OCR Datasets (Chinese/English)](doc/doc_en/dataset/datasets_en.md)
    - [Handwritten OCR Datasets (Chinese)](doc/doc_en/dataset/handwritten_datasets_en.md)
    - [Various OCR Datasets (multilingual)](doc/doc_en/dataset/vertical_and_multilingual_datasets_en.md)
    - [Layout Analysis](doc/doc_en/dataset/layout_datasets_en.md)
    - [Table Recognition](doc/doc_en/dataset/table_datasets_en.md)
    - [Key Information Extraction](doc/doc_en/dataset/kie_datasets_en.md)
- [Code Structure](./doc/doc_en/tree_en.md)
- [Visualization](#Visualization)
- [Community](#Community)
- [New language requests](#language_requests)
- [FAQ](./doc/doc_en/FAQ_en.md)
- [References](./doc/doc_en/reference_en.md)
- [License](#LICENSE)


<a name="Visualization"></a>
## Visualization [more](./doc/doc_en/visualization_en.md)

<details open>
<summary>PP-OCRv3 Chinese model</summary>
<div align="center">
    <img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic001.jpg" width="800">
    <img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic002.jpg" width="800">
    <img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic003.jpg" width="800">
</div>
</details>

<details open>
<summary>PP-OCRv3 English model</summary>
<div align="center">
    <img src="doc/imgs_results/PP-OCRv3/en/en_1.png" width="800">
    <img src="doc/imgs_results/PP-OCRv3/en/en_2.png" width="800">
</div>
</details>

<details open>
<summary>PP-OCRv3 Multilingual model</summary>
<div align="center">
    <img src="doc/imgs_results/PP-OCRv3/multi_lang/japan_2.jpg" width="800">
    <img src="doc/imgs_results/PP-OCRv3/multi_lang/korean_1.jpg" width="800">
</div>
</details>

<details open>
<summary>PP-Structurev2</summary>

- layout analysis + table recognition  
<div align="center">
    <img src="./ppstructure/docs/table/ppstructure.GIF" width="800">
</div>

- SER (Semantic entity recognition)
<div align="center">
    <img src="https://user-images.githubusercontent.com/25809855/186094456-01a1dd11-1433-4437-9ab2-6480ac94ec0a.png" width="600">
</div>
    
<div align="center">
    <img src="https://user-images.githubusercontent.com/14270174/185310636-6ce02f7c-790d-479f-b163-ea97a5a04808.jpg" width="600">
</div>

<div align="center">
    <img src="https://user-images.githubusercontent.com/14270174/185539517-ccf2372a-f026-4a7c-ad28-c741c770f60a.png" width="600">
</div>

- RE (Relation Extraction)
<div align="center">
    <img src="https://user-images.githubusercontent.com/25809855/186094813-3a8e16cc-42e5-4982-b9f4-0134dfb5688d.png" width="600">
</div>   

<div align="center">
    <img src="https://user-images.githubusercontent.com/14270174/185393805-c67ff571-cf7e-4217-a4b0-8b396c4f22bb.jpg" width="600">
</div>

<div align="center">
    <img src="https://user-images.githubusercontent.com/14270174/185540080-0431e006-9235-4b6d-b63d-0b3c6e1de48f.jpg" width="600">
</div>

</details>

<a name="language_requests"></a>
## Guideline for New Language Requests

If you want to request support for a new language, a PR with the following file is needed:

1. In the folder [ppocr/utils/dict](./ppocr/utils/dict), submit a dictionary text file named `{language}_dict.txt` that contains a list of all characters. Please see the other files in that folder for the expected format.
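
One way to produce such a dictionary is to collect the distinct characters from a text corpus. The sketch below assumes the one-character-per-line UTF-8 layout; check the existing files in the folder before submitting. The function names and the choice to skip whitespace characters are illustrative and not a PaddleOCR requirement.

```python
from pathlib import Path


def build_char_dict(corpus_lines):
    """Collect every distinct non-whitespace character, sorted for stable diffs."""
    chars = set()
    for line in corpus_lines:
        chars.update(ch for ch in line if not ch.isspace())
    return sorted(chars)


def write_char_dict(chars, language, out_dir="."):
    """Write one character per line to `{language}_dict.txt` (UTF-8)."""
    path = Path(out_dir) / f"{language}_dict.txt"
    path.write_text("\n".join(chars) + "\n", encoding="utf-8")
    return path
```

For instance, passing the lines of your own label text to `build_char_dict` and the result to `write_char_dict(chars, "xx")` would create `xx_dict.txt` in the current directory, where `xx` stands in for your language name.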

If your language has unique elements, please let us know in advance in any way you like, for example through useful links, Wikipedia pages, and so on.

For more details, please refer to the [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).


<a name="LICENSE"></a>
## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.