diff --git a/README.md b/README.md index f57672e5055df042ede9ae03bbed590889c5941c..58e436c99220d6b4ec5e4e4934bbeeed66408503 100644 --- a/README.md +++ b/README.md @@ -113,18 +113,19 @@ PaddleOCR support a variety of cutting-edge algorithms related to OCR, and devel - [Quick Start](./ppstructure/docs/quickstart_en.md) - [Model Zoo](./ppstructure/docs/models_list_en.md) - [Model training](./doc/doc_en/training_en.md) - - [Layout Parser](./ppstructure/layout/README.md) + - [Layout Analysis](./ppstructure/layout/README.md) - [Table Recognition](./ppstructure/table/README.md) - - [DocVQA](./ppstructure/vqa/README.md) - - [Key Information Extraction](./ppstructure/docs/kie_en.md) + - [Key Information Extraction](./ppstructure/kie/README.md) - [Inference and Deployment](./deploy/README.md) - [Python Inference](./ppstructure/docs/inference_en.md) - - [C++ Inference]() + - [C++ Inference](./deploy/cpp_infer/readme.md) - [Serving](./deploy/pdserving/README.md) -- [Academic algorithms](./doc/doc_en/algorithms_en.md) +- [Academic Algorithms](./doc/doc_en/algorithm_overview_en.md) - [Text detection](./doc/doc_en/algorithm_overview_en.md) - [Text recognition](./doc/doc_en/algorithm_overview_en.md) - - [End-to-end](./doc/doc_en/algorithm_overview_en.md) + - [End-to-end OCR](./doc/doc_en/algorithm_overview_en.md) + - [Table Recognition](./doc/doc_en/algorithm_overview_en.md) + - [Key Information Extraction](./doc/doc_en/algorithm_overview_en.md) - [Add New Algorithms to PaddleOCR](./doc/doc_en/add_new_algorithm_en.md) - Data Annotation and Synthesis - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md) @@ -135,9 +136,9 @@ PaddleOCR support a variety of cutting-edge algorithms related to OCR, and devel - [General OCR Datasets(Chinese/English)](doc/doc_en/dataset/datasets_en.md) - [HandWritten_OCR_Datasets(Chinese)](doc/doc_en/dataset/handwritten_datasets_en.md) - [Various OCR Datasets(multilingual)](doc/doc_en/dataset/vertical_and_multilingual_datasets_en.md) - - 
[layout analysis](doc/doc_en/dataset/layout_datasets_en.md) - - [table recognition](doc/doc_en/dataset/table_datasets_en.md) - - [DocVQA](doc/doc_en/dataset/docvqa_datasets_en.md) + - [Layout Analysis](doc/doc_en/dataset/layout_datasets_en.md) + - [Table Recognition](doc/doc_en/dataset/table_datasets_en.md) + - [Key Information Extraction](doc/doc_en/dataset/kie_datasets_en.md) - [Code Structure](./doc/doc_en/tree_en.md) - [Visualization](#Visualization) - [Community](#Community) @@ -176,7 +177,7 @@ PaddleOCR support a variety of cutting-edge algorithms related to OCR, and devel
-PP-Structure +PP-Structurev2 - layout analysis + table recognition
@@ -185,12 +186,28 @@ PaddleOCR support a variety of cutting-edge algorithms related to OCR, and devel - SER (Semantic entity recognition)
- + +
+ +
+ +
+ +
+
- RE (Relation Extraction)
- + +
+ +
+ +
+ +
+
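The SER and RE visualizations above reduce to a simple post-processing step: SER assigns a label (header, question/query, answer) to each detected text region, and RE links question regions to their answer regions. Below is a minimal sketch of that pairing step; the entity and link structures here are hypothetical stand-ins for illustration, not the actual PaddleOCR KIE output format:

```python
# Illustrative sketch only: `entities` and `links` are hypothetical stand-ins,
# not the real PaddleOCR KIE output schema.

def pair_question_answer(entities, links):
    """entities: id -> {"label": ..., "text": ...};
    links: (question_id, answer_id) pairs produced by relation extraction.
    Returns (question_text, answer_text) tuples."""
    pairs = []
    for q_id, a_id in links:
        q, a = entities[q_id], entities[a_id]
        # Keep only links that actually connect a question region to an answer region.
        if q["label"] == "QUESTION" and a["label"] == "ANSWER":
            pairs.append((q["text"], a["text"]))
    return pairs

entities = {
    0: {"label": "QUESTION", "text": "Name"},
    1: {"label": "ANSWER", "text": "Zhang San"},
    2: {"label": "HEADER", "text": "Registration Form"},
}
print(pair_question_answer(entities, [(0, 1)]))  # [('Name', 'Zhang San')]
```

The header entity carries no relation, so it simply drops out of the paired output.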
diff --git a/README_ch.md b/README_ch.md index c52d5f3dd17839254c3f58794e016f08dc0b21bc..49e84e15fd429ecb26c6c579857920882a1145d6 100755 --- a/README_ch.md +++ b/README_ch.md @@ -213,14 +213,30 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - SER(语义实体识别)
- +
+
+ +
+ +
+ +
+ - RE(关系提取)
- +
+
+ +
+ +
+ +
+ diff --git a/doc/doc_en/algorithm_en.md b/doc/doc_en/algorithm_en.md deleted file mode 100644 index c880336b4ad528eab2cce479edf11fce0b43f435..0000000000000000000000000000000000000000 --- a/doc/doc_en/algorithm_en.md +++ /dev/null @@ -1,11 +0,0 @@ -# Academic Algorithms and Models - -PaddleOCR will add cutting-edge OCR algorithms and models continuously. Check out the supported models and tutorials by clicking the following list: - - -- [text detection algorithms](./algorithm_overview_en.md#11) -- [text recognition algorithms](./algorithm_overview_en.md#12) -- [end-to-end algorithms](./algorithm_overview_en.md#2) -- [table recognition algorithms](./algorithm_overview_en.md#3) - -Developers are welcome to contribute more algorithms! Please refer to [add new algorithm](./add_new_algorithm_en.md) guideline. diff --git a/doc/doc_en/algorithm_overview_en.md b/doc/doc_en/algorithm_overview_en.md index 3f59bf9c829920fb43fa7f89858b4586ceaac26f..5bf569e3e1649cfabbe196be7e1a55d1caa3bf61 100755 --- a/doc/doc_en/algorithm_overview_en.md +++ b/doc/doc_en/algorithm_overview_en.md @@ -7,7 +7,11 @@ - [3. Table Recognition Algorithms](#3) - [4. Key Information Extraction Algorithms](#4) -This tutorial lists the OCR algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md). +This tutorial lists the OCR algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCRv3 models list](./models_list_en.md). + +>> +Developers are welcome to contribute more algorithms! 
Please refer to [add new algorithm](./add_new_algorithm_en.md) guideline. + diff --git a/ppstructure/README.md b/ppstructure/README.md index 66df10b2ec4d52fb743c40893d5fc5aa7d6ab5be..fb3697bc1066262833ee20bcbb8f79833f264f14 100644 --- a/ppstructure/README.md +++ b/ppstructure/README.md @@ -1,120 +1,115 @@ English | [简体中文](README_ch.md) - [1. Introduction](#1-introduction) -- [2. Update log](#2-update-log) -- [3. Features](#3-features) -- [4. Results](#4-results) - - [4.1 Layout analysis and table recognition](#41-layout-analysis-and-table-recognition) - - [4.2 KIE](#42-kie) -- [5. Quick start](#5-quick-start) -- [6. PP-Structure System](#6-pp-structure-system) - - [6.1 Layout analysis and table recognition](#61-layout-analysis-and-table-recognition) - - [6.1.1 Layout analysis](#611-layout-analysis) - - [6.1.2 Table recognition](#612-table-recognition) - - [6.2 KIE](#62-kie) -- [7. Model List](#7-model-list) - - [7.1 Layout analysis model](#71-layout-analysis-model) - - [7.2 OCR and table recognition model](#72-ocr-and-table-recognition-model) - - [7.3 KIE model](#73-kie-model) +- [2. Features](#2-features) +- [3. Results](#3-results) + - [3.1 Layout analysis and table recognition](#31-layout-analysis-and-table-recognition) + - [3.2 Layout Recovery](#32-layout-recovery) + - [3.3 KIE](#33-kie) +- [4. Quick start](#4-quick-start) +- [5. Model List](#5-model-list) ## 1. Introduction -PP-Structure is an OCR toolkit that can be used for document analysis and processing with complex structures, designed to help developers better complete document understanding tasks +PP-Structure is an intelligent document analysis system developed by the PaddleOCR team, which aims to help developers better complete tasks related to document understanding such as layout analysis and table recognition. -## 2. Update log -* 2022.02.12 KIE add LayoutLMv2 model。 -* 2021.12.07 add [KIE SER and RE tasks](kie/README.md)。 +The pipeline of PP-Structurev2 system is shown below. 
The document image first passes through the image direction correction module to identify the direction of the entire image and complete the direction correction. Then, two tasks can be completed: layout information analysis and key information extraction. -## 3. Features +- In the layout analysis task, the image first goes through the layout analysis model, which divides the image into different areas such as text, table, and figure; these areas are then analyzed separately. For example, the table area is sent to the table recognition module for structured recognition, and the text area is sent to the OCR engine for text recognition. Finally, the layout recovery module restores the result to a Word or PDF file with the same layout as the original image; +- In the key information extraction task, the OCR engine is first used to extract the text content, then the SER (semantic entity recognition) module obtains the semantic entities in the image, and finally the RE (relation extraction) module obtains the correspondence between the semantic entities, thereby extracting the required key information. + -The main features of PP-Structure are as follows: +More technical details: 👉 [PP-Structurev2 Technical Report](docs/PP-Structurev2_introduction.md) -- Support the layout analysis of documents, divide the documents into 5 types of areas **text, title, table, image and list** (conjunction with Layout-Parser) -- Support to extract the texts from the text, title, picture and list areas (used in conjunction with PP-OCR) -- Support to extract excel files from the table areas -- Support python whl package and command line usage, easy to use -- Support custom training for layout analysis and table structure tasks -- Support Document Key Information Extraction (KIE) tasks: Semantic Entity Recognition (SER) and Relation Extraction (RE) +PP-Structurev2 supports independent use or flexible collocation of each module. 
For example, you can use layout analysis alone or table recognition alone. Click the corresponding link below to get the tutorial for each independent module: -## 4. Results +- [Layout Analysis](layout/README.md) +- [Table Recognition](table/README.md) +- [Key Information Extraction](kie/README.md) +- [Layout Recovery](recovery/README.md) -### 4.1 Layout analysis and table recognition ## 2. Features - - -The figure shows the pipeline of layout analysis + table recognition. The image is first divided into four areas of image, text, title and table by layout analysis, and then OCR detection and recognition is performed on the three areas of image, text and title, and the table is performed table recognition, where the image will also be stored for use. - -### 4.2 KIE - -* SER -* -![](docs/kie/result_ser/zh_val_0_ser.jpg) | ![](docs/kie/result_ser/zh_val_42_ser.jpg) ----|--- - -Different colored boxes in the figure represent different categories. For xfun dataset, there are three categories: query, answer and header: +The main features of PP-Structurev2 are as follows: +- Support layout analysis of documents in the form of images/PDFs, which can be divided into areas such as **text, titles, tables, figures, formulas, etc.**; +- Support common Chinese and English **table detection** tasks; +- Support structured table recognition, and output the final result to an **Excel file**; +- Support multimodal-based Key Information Extraction (KIE) tasks - **Semantic Entity Recognition** (SER) and **Relation Extraction** (RE); +- Support **layout recovery**, that is, restore the document in Word or PDF format with the same layout as the original image; +- Support customized training and multiple inference deployment methods such as python whl package quick start; +- Connect with the semi-automatic data labeling tool PPOCRLabel, which supports the labeling of layout analysis, table recognition, and SER. -* Dark purple: header -* Light purple: query -* Army green: answer +## 3. 
Results -The corresponding category and OCR recognition results are also marked at the top left of the OCR detection box. +PP-Structurev2 supports the independent use or flexible collocation of each module. For example, layout analysis can be used alone, or table recognition can be used alone. Only the visualization results of several representative usage modes are shown here. +### 3.1 Layout analysis and table recognition -* RE - -![](docs/kie/result_re/zh_val_21_re.jpg) | ![](docs/kie/result_re/zh_val_40_re.jpg) ----|--- +The figure shows the pipeline of layout analysis + table recognition. The image is first divided into four areas of image, text, title and table by layout analysis; OCR detection and recognition is then performed on the image, text and title areas, while table recognition is performed on the table area. The cropped image areas are also stored for later use. + +### 3.2 Layout recovery -In the figure, the red box represents the question, the blue box represents the answer, and the question and answer are connected by green lines. The corresponding category and OCR recognition results are also marked at the top left of the OCR detection box. +The following figure shows the effect of layout recovery based on the results of layout analysis and table recognition in the previous section. + -## 5. Quick start +### 3.3 KIE -Start from [Quick Installation](./docs/quickstart.md) +* SER -## 6. PP-Structure System +Different colored boxes in the figure represent different categories. -### 6.1 Layout analysis and table recognition +
+ +
-![pipeline](docs/table/pipeline.jpg) +
+ +
-In PP-Structure, the image will be divided into 5 types of areas **text, title, image list and table**. For the first 4 types of areas, directly use PP-OCR system to complete the text detection and recognition. For the table area, after the table structuring process, the table in image is converted into an Excel file with the same table style. +
+ +
-#### 6.1.1 Layout analysis +
+ +
-Layout analysis classifies image by region, including the use of Python scripts of layout analysis tools, extraction of designated category detection boxes, performance indicators, and custom training layout analysis models. For details, please refer to [document](layout/README.md). +
+ +
-#### 6.1.2 Table recognition +* RE -Table recognition converts table images into excel documents, which include the detection and recognition of table text and the prediction of table structure and cell coordinates. For detailed instructions, please refer to [document](table/README.md) +In the figure, the red box represents `Question`, the blue box represents `Answer`, and `Question` and `Answer` are connected by green lines. -### 6.2 KIE +
+ +
-Multi-modal based Key Information Extraction (KIE) methods include Semantic Entity Recognition (SER) and Relation Extraction (RE) tasks. Based on SER task, text recognition and classification in images can be completed. Based on THE RE task, we can extract the relation of the text content in the image, such as judge the problem pair. For details, please refer to [document](kie/README.md) +
+ +
-## 7. Model List +
+ +
-PP-Structure Series Model List (Updating) +
+ +
-### 7.1 Layout analysis model +## 4. Quick start -|model name|description|download|label_map| -| --- | --- | --- |--- | -| ppyolov2_r50vd_dcn_365e_publaynet | The layout analysis model trained on the PubLayNet dataset can divide image into 5 types of areas **text, title, table, picture, and list** | [PubLayNet](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) | {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}| +Start from [Quick Start](./docs/quickstart_en.md). -### 7.2 OCR and table recognition model +## 5. Model List -|model name|description|model size|download| -| --- | --- | --- | --- | -|ch_PP-OCRv3_det| [New] Lightweight model, supporting Chinese, English, multilingual text detection | 3.8M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar)| -|ch_PP-OCRv3_rec| [New] Lightweight model, supporting Chinese, English, multilingual text recognition | 12.4M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) | -|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) | +Some tasks need to use both the structured analysis models and the OCR models. For example, the table recognition task needs to use the table recognition model for structured analysis, and the OCR model to recognize the text in the table. Please select the appropriate models according to your specific needs. 
-### 7.3 KIE model +For structural analysis related model downloads, please refer to: +- [PP-Structure Model Zoo](./docs/models_list_en.md) -|model name|description|model size|download| -| --- | --- | --- | --- | -|ser_LayoutXLM_xfun_zhd|SER model trained on xfun Chinese dataset based on LayoutXLM|1.4G|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) | -|re_LayoutXLM_xfun_zh|RE model trained on xfun Chinese dataset based on LayoutXLM|1.4G|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) | +For OCR related model downloads, please refer to: +- [PP-OCR Model Zoo](../doc/doc_en/models_list_en.md) -If you need to use other models, you can download the model in [PPOCR model_list](../doc/doc_en/models_list_en.md) and [PPStructure model_list](./docs/models_list.md) diff --git a/ppstructure/README_ch.md b/ppstructure/README_ch.md index 6539002bfe1497853dfa11eb774cf3c453567988..87a9c625b32c32e9c7fffb8ebc9b9fdf3b2130db 100644 --- a/ppstructure/README_ch.md +++ b/ppstructure/README_ch.md @@ -21,7 +21,7 @@ PP-Structurev2系统流程图如下所示,文档图像首先经过图像矫正 - 关键信息抽取任务中,首先使用OCR引擎提取文本内容,然后由语义实体识别模块获取图像中的语义实体,最后经关系抽取模块获取语义实体之间的对应关系,从而提取需要的关键信息。 -更多技术细节:👉 [PP-Structurev2技术报告]() +更多技术细节:👉 [PP-Structurev2技术报告](docs/PP-Structurev2_introduction.md) PP-Structurev2支持各个模块独立使用或灵活搭配,如,可以单独使用版面分析,或单独使用表格识别,点击下面相应链接获取各个独立模块的使用教程: @@ -76,6 +76,14 @@ PP-Structurev2支持各个模块独立使用或灵活搭配,如,可以单独 +
+ +
+ +
+ +
+ * RE 图中红色框表示`问题`,蓝色框表示`答案`,`问题`和`答案`之间使用绿色线连接。 @@ -88,6 +96,14 @@ PP-Structurev2支持各个模块独立使用或灵活搭配,如,可以单独 +
+ +
+ +
+ +
+ ## 4. 快速体验 diff --git a/ppstructure/docs/inference_en.md b/ppstructure/docs/inference_en.md index 357e26a11f7e86a342bb3dbf24ea3c721705ae98..71019ec70f80e44bc16d2b0d07b0bb93b475b7e7 100644 --- a/ppstructure/docs/inference_en.md +++ b/ppstructure/docs/inference_en.md @@ -1,13 +1,13 @@ # Python Inference -- [1. Structure](#1) +- [1. Layout Structured Analysis](#1) - [1.1 layout analysis + table recognition](#1.1) - [1.2 layout analysis](#1.2) - [1.3 table recognition](#1.3) -- [2. KIE](#2) +- [2. Key Information Extraction](#2) -## 1. Structure +## 1. Layout Structured Analysis Go to the `ppstructure` directory ```bash @@ -70,7 +70,7 @@ python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \ After the operation is completed, each image will have a directory with the same name in the `structure` directory under the directory specified by the `output` field. Each table in the image will be stored as an excel. The filename of excel is their coordinates in the image. -## 2. KIE +## 2. Key Information Extraction ```bash cd ppstructure diff --git a/ppstructure/docs/quickstart_en.md b/ppstructure/docs/quickstart_en.md index dbfbf43b01c94bd6f9c729f2f6edcd1dd6aee056..f0fbc86394dab00f1715f8f8fda30f3116c4fd07 100644 --- a/ppstructure/docs/quickstart_en.md +++ b/ppstructure/docs/quickstart_en.md @@ -1,7 +1,7 @@ # PP-Structure Quick Start -- [1. Install package](#1-install-package) -- [2. Use](#2-use) +- [1. Environment Preparation](#1-environment-preparation) +- [2. 
Quick Use](#2-quick-use) - [2.1 Use by command line](#21-use-by-command-line) - [2.1.1 image orientation + layout analysis + table recognition](#211-image-orientation--layout-analysis--table-recognition) - [2.1.2 layout analysis + table recognition](#212-layout-analysis--table-recognition) @@ -9,35 +9,56 @@ - [2.1.4 table recognition](#214-table-recognition) - [2.1.5 Key Information Extraction](#215-Key-Information-Extraction) - [2.1.6 layout recovery](#216-layout-recovery) - - [2.2 Use by code](#22-use-by-code) + - [2.2 Use by python script](#22-use-by-python-script) - [2.2.1 image orientation + layout analysis + table recognition](#221-image-orientation--layout-analysis--table-recognition) - [2.2.2 layout analysis + table recognition](#222-layout-analysis--table-recognition) - [2.2.3 layout analysis](#223-layout-analysis) - [2.2.4 table recognition](#224-table-recognition) - - [2.2.5 DocVQA](#225-dockie) - [2.2.5 Key Information Extraction](#225-Key-Information-Extraction) - [2.2.6 layout recovery](#226-layout-recovery) - [2.3 Result description](#23-result-description) - [2.3.1 layout analysis + table recognition](#231-layout-analysis--table-recognition) - [2.3.2 Key Information Extraction](#232-Key-Information-Extraction) - [2.4 Parameter Description](#24-parameter-description) +- [3. Summary](#3-summary) -## 1. Install package +## 1. Environment Preparation +### 1.1 Install PaddlePaddle + +> If you do not have a Python environment, please refer to [Environment Preparation](./environment_en.md). 
+ +- If you have CUDA 9 or CUDA 10 installed on your machine, please run the following command to install + + ```bash + python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple + ``` + +- If you have no available GPU on your machine, please run the following command to install the CPU version + + ```bash + python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple + ``` + +For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation. + +### 1.2 Install PaddleOCR Whl Package ```bash # Install paddleocr, version 2.6 is recommended pip3 install "paddleocr>=2.6" -# Install the KIE dependency packages (if you do not use the KIE, you can skip it) -pip install -r kie/requirements.txt + # Install the image direction classification dependency package paddleclas (if you do not use the image direction classification, you can skip it) pip3 install paddleclas + +# Install the KIE dependency packages (if you do not use the KIE, you can skip it) +pip3 install -r kie/requirements.txt ``` -## 2. Use +## 2. 
Quick Use ### 2.1 Use by command line @@ -45,40 +66,40 @@ pip3 install paddleclas #### 2.1.1 image orientation + layout analysis + table recognition ```bash -paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --image_orientation=true +paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure --image_orientation=true ``` #### 2.1.2 layout analysis + table recognition ```bash -paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure +paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure ``` #### 2.1.3 layout analysis ```bash -paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --table=false --ocr=false +paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure --table=false --ocr=false ``` #### 2.1.4 table recognition ```bash -paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structure --layout=false +paddleocr --image_dir=ppstructure/docs/table/table.jpg --type=structure --layout=false ``` #### 2.1.5 Key Information Extraction -Please refer to: [Key Information Extraction](../kie/README.md) . +Key information extraction does not currently support use by the whl package. For detailed usage tutorials, please refer to: [Key Information Extraction](../kie/README.md). 
#### 2.1.6 layout recovery ```bash -paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true +paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure --recovery=true ``` -### 2.2 Use by code +### 2.2 Use by python script #### 2.2.1 image orientation + layout analysis + table recognition @@ -91,7 +112,7 @@ from paddleocr import PPStructure,draw_structure_result,save_structure_res table_engine = PPStructure(show_log=True, image_orientation=True) save_folder = './output' -img_path = 'PaddleOCR/ppstructure/docs/table/1.png' +img_path = 'ppstructure/docs/table/1.png' img = cv2.imread(img_path) result = table_engine(img) save_structure_res(result, save_folder,os.path.basename(img_path).split('.')[0]) @@ -102,7 +123,7 @@ for line in result: from PIL import Image -font_path = 'PaddleOCR/doc/fonts/simfang.ttf' # PaddleOCR下提供字体包 +font_path = 'doc/fonts/simfang.ttf' # font provided in PaddleOCR image = Image.open(img_path).convert('RGB') im_show = draw_structure_result(image, result,font_path=font_path) im_show = Image.fromarray(im_show) @@ -120,7 +141,7 @@ from paddleocr import PPStructure,draw_structure_result,save_structure_res table_engine = PPStructure(show_log=True) save_folder = './output' -img_path = 'PaddleOCR/ppstructure/docs/table/1.png' +img_path = 'ppstructure/docs/table/1.png' img = cv2.imread(img_path) result = table_engine(img) save_structure_res(result, save_folder,os.path.basename(img_path).split('.')[0]) @@ -131,7 +152,7 @@ for line in result: from PIL import Image -font_path = 'PaddleOCR/doc/fonts/simfang.ttf' # font provieded in PaddleOCR +font_path = 'doc/fonts/simfang.ttf' # font provided in PaddleOCR image = Image.open(img_path).convert('RGB') im_show = draw_structure_result(image, result,font_path=font_path) im_show = Image.fromarray(im_show) @@ -149,7 +170,7 @@ from paddleocr import PPStructure,save_structure_res table_engine = PPStructure(table=False, ocr=False, show_log=True) save_folder = './output' -img_path 
= 'PaddleOCR/ppstructure/docs/table/1.png' +img_path = 'ppstructure/docs/table/1.png' img = cv2.imread(img_path) result = table_engine(img) save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0]) @@ -170,7 +191,7 @@ from paddleocr import PPStructure,save_structure_res table_engine = PPStructure(layout=False, show_log=True) save_folder = './output' -img_path = 'PaddleOCR/ppstructure/docs/table/table.jpg' +img_path = 'ppstructure/docs/table/table.jpg' img = cv2.imread(img_path) result = table_engine(img) save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0]) @@ -183,7 +204,7 @@ for line in result: #### 2.2.5 Key Information Extraction -Please refer to: [Key Information Extraction](../kie/README.md) . +Key information extraction does not currently support use by the whl package. For detailed usage tutorials, please refer to: [Key Information Extraction](../kie/README.md). #### 2.2.6 layout recovery @@ -197,7 +218,7 @@ from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, table_engine = PPStructure(layout=False, show_log=True) save_folder = './output' -img_path = 'PaddleOCR/ppstructure/docs/table/1.png' +img_path = 'ppstructure/docs/table/1.png' img = cv2.imread(img_path) result = table_engine(img) save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0]) @@ -231,8 +252,8 @@ Each field in dict is described as follows: | field | description | | --- |---| -|type| Type of image area. | -|bbox| The coordinates of the image area in the original image, respectively [upper left corner x, upper left corner y, lower right corner x, lower right corner y]. | +|type| Type of image area. | +|bbox| The coordinates of the image area in the original image, respectively [upper left corner x, upper left corner y, lower right corner x, lower right corner y]. | +|res| OCR or table recognition result of the image area.
table: a dict with field descriptions as follows:
        `html`: html str of table.
        In the code usage mode, set return_ocr_result_in_table=True when calling to also get the detection and recognition results of each text in the table area, corresponding to the following fields:
        `boxes`: text detection boxes.
        `rec_res`: text recognition results.
OCR: A tuple containing the detection boxes and recognition results of each single text. | After the recognition is completed, each image will have a directory with the same name under the directory specified by the `output` field. Each table in the image will be stored as an excel, and the picture area will be cropped and saved. The filename of excel and picture is their coordinates in the image. @@ -276,3 +297,8 @@ Please refer to: [Key Information Extraction](../kie/README.md) . | structure_version | Structure version, optional PP-structure and PP-structurev2 | PP-structure | Most of the parameters are consistent with the PaddleOCR whl package, see [whl package documentation](../../doc/doc_en/whl.md) + + +## 3. Summary + +This section covered how to use the PP-Structure related functions through the PaddleOCR whl package. Please refer to the [documentation tutorial](../../README.md) for more detailed usage, including model training, inference, and deployment. \ No newline at end of file
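As a quick illustration of the result format described in section 2.3, the sketch below pulls the HTML of every table region out of a PP-Structure-style result list. The sample dicts are hand-written to match the documented `type`/`bbox`/`res` fields; they are not real PPStructure output:

```python
# Illustrative sketch only: `sample` mirrors the documented result fields
# (`type`, `bbox`, `res`) but is not actual PPStructure output.

def collect_table_html(result):
    """Return the HTML string of every table region in a PP-Structure result list."""
    return [region["res"]["html"] for region in result if region["type"] == "table"]

sample = [
    # Text regions carry an OCR tuple of (detection boxes, recognition results).
    {"type": "text", "bbox": [10, 10, 200, 40], "res": ([[10, 10]], [("hello", 0.99)])},
    # Table regions carry a dict whose `html` field holds the structured table.
    {"type": "table", "bbox": [10, 60, 400, 300], "res": {"html": "<table><tr><td>1</td></tr></table>"}},
]
print(collect_table_html(sample))  # ['<table><tr><td>1</td></tr></table>']
```

The same filtering pattern applies to any region type (`figure`, `title`, etc.) when you only need one kind of output from the result list.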