# PaddleStructure

Install layoutparser:
```sh
pip3 install https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
```

## 1. Introduction to pipeline

PaddleStructure is a toolkit for OCR on documents with complex layouts. The pipeline is as follows:

![pipeline](../doc/table/pipeline.jpg)

PaddleStructure first analyzes the image with layoutparser. Layout analysis classifies each region of the image, and the OCR process applied to a region depends on its category.

layoutparser currently outputs five categories:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Regions of types 1-4 go through the conventional OCR process, and regions of type 5 go through the Table OCR process.
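
The routing rule can be sketched as a small dispatch function. This is only an illustration, not PaddleStructure's actual code: `route_region` and the two pipeline labels are hypothetical names, and only the five category names come from the list above.

```python
# Hypothetical sketch of the routing rule; not the PaddleStructure implementation.
TABLE_CATEGORY = "Table"
GENERAL_CATEGORIES = {"Text", "Title", "Figure", "List"}


def route_region(category: str) -> str:
    """Return the (hypothetical) pipeline label for a layout category."""
    if category == TABLE_CATEGORY:
        return "table_ocr"
    if category in GENERAL_CATEGORIES:
        return "general_ocr"
    raise ValueError(f"unknown layout category: {category!r}")


if __name__ == "__main__":
    for name in ["Text", "Title", "Figure", "List", "Table"]:
        print(name, "->", route_region(name))
```

Raising on an unknown category, rather than silently falling back to general OCR, is a design choice of this sketch so that a new layout class is noticed early.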

## 2. LayoutParser


## 3. Table OCR

See the [Table OCR documentation](table/README.md).

## 4. Prediction with the inference engine

Run the following command to perform inference:
```sh
python3 predict_system.py \
    --det_model_dir=path/to/det_model_dir \
    --rec_model_dir=path/to/rec_model_dir \
    --table_model_dir=path/to/table_model_dir \
    --image_dir=../doc/table/1.png \
    --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
    --rec_char_type=EN \
    --det_limit_side_len=736 \
    --det_limit_type=min \
    --output ../output/table
```
After the run completes, a directory named after each image is created under the directory specified by the `output` parameter. Each table in an image is saved as an Excel file whose name is the table's coordinates in the image.
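
To collect the saved tables for one image afterwards, a helper along the following lines may be useful. It is a sketch under assumptions: `find_table_files` is a hypothetical name, the `*.xls*` glob is a guess at the Excel file extension, and the per-image directory name is derived as the file name up to the first dot, as the sample code in section 5.1.1 does.

```python
import os
from pathlib import Path
from typing import List


def find_table_files(output_dir: str, image_path: str,
                     pattern: str = "*.xls*") -> List[Path]:
    """List the Excel files saved for one image (hypothetical helper).

    Assumes the layout <output_dir>/<image name>/<table coordinates>.<ext>.
    The default glob pattern is an assumption about the file extension.
    """
    # Directory name: image file name up to the first dot
    image_name = os.path.basename(image_path).split('.')[0]
    image_dir = Path(output_dir) / image_name
    if not image_dir.is_dir():
        return []
    return sorted(image_dir.glob(pattern))


if __name__ == "__main__":
    for path in find_table_files("../output/table", "../doc/table/1.png"):
        print(path)
```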

## 5. Introduction to the PaddleStructure whl package

### 5.1 Use

#### 5.1.1 Use in code
```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

# Create the layout analysis and table recognition engine
table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)

# Run the pipeline on the image and save the results
result = table_engine(img)
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    print(line)

# Draw the results on the image and save the visualization
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

#### 5.1.2 Use from the command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

### Parameter Description
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

| Parameter           | Description                                                                           | Default                                     |
|---------------------|---------------------------------------------------------------------------------------|---------------------------------------------|
| output              | Path where the Excel files and recognition results are saved                          | ./output/table                              |
| structure_max_len   | Length to which the long side of the image is resized for table structure prediction  | 488                                         |
| structure_model_dir | Path to the table structure inference model                                           | None                                        |
| structure_char_type | Dictionary path used by the table structure model                                     | ../ppocr/utils/dict/table_structure_dict.tx |
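
To make the effect of `structure_max_len` concrete, the sketch below computes the size an image would have after its long side is resized to that length. It assumes the aspect ratio is preserved and that smaller images are scaled up as well; the actual preprocessing in the table structure model may differ (for example, it may pad). `resize_long_side` is a hypothetical helper, not part of the package.

```python
from typing import Tuple


def resize_long_side(width: int, height: int, max_len: int = 488) -> Tuple[int, int]:
    """Scale (width, height) so that the longer side equals max_len.

    Assumption: the aspect ratio is kept and the result is rounded to integers.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = max_len / max(width, height)
    # Keep each side at least 1 pixel after rounding
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(resize_long_side(976, 488))  # (488, 244)
```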