# PaddleStructure

Install layoutparser:
```sh
wget  https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

## 1. Introduction to pipeline

PaddleStructure is a toolkit for OCR on documents with complex layouts. The pipeline is as follows:

![pipeline](../doc/table/pipeline.png)

In PaddleStructure, the image is first analyzed by layoutparser. Layout analysis classifies each region of the image, and each region is then processed by the OCR pipeline that matches its category.

Currently, layoutparser outputs five categories:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Regions of types 1-4 follow the traditional OCR process, and regions of type 5 follow the Table OCR process.
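
The routing rule above can be sketched as a small dispatch function. This is only an illustration: `route_region` and the labels `"ocr"` and `"table_ocr"` are hypothetical names chosen for this sketch, not part of PaddleStructure's API.

```python
# Hypothetical sketch: the five category names come from the list above;
# the function name and the returned labels are illustrative assumptions.
OCR_TYPES = {"Text", "Title", "Figure", "List"}
TABLE_TYPES = {"Table"}


def route_region(region_type: str) -> str:
    """Return the name of the pipeline a layout region would be sent to."""
    if region_type in TABLE_TYPES:
        return "table_ocr"
    if region_type in OCR_TYPES:
        return "ocr"
    raise ValueError(f"unknown layout category: {region_type!r}")


if __name__ == "__main__":
    for category in ["Text", "Title", "Figure", "List", "Table"]:
        print(category, "->", route_region(category))
```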

## 2. LayoutParser


## 3. Table OCR

See the [Table OCR documentation](table/README.md).

## 4. Prediction with the inference engine

Use the following command to run inference:
```sh
python3 table/predict_system.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table
```
After running, each image has a directory of the same name under the directory specified by the `output` option. Each table in the image is saved as an Excel file whose file name is the coordinates of the table in the image.
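
To inspect the results afterwards, one could walk the output directory. The sketch below relies only on what the paragraph above states (one sub-directory per image, one Excel file per table). The helper name `list_table_files` and the `.xls` / `.xlsx` extensions are assumptions, so adjust the filter to whatever the tool actually writes.

```python
from pathlib import Path


def list_table_files(output_dir, image_path):
    """List the Excel files saved for one image (hypothetical helper)."""
    # The per-image directory is assumed to be named after the image file stem.
    image_dir = Path(output_dir) / Path(image_path).stem
    if not image_dir.is_dir():
        return []
    # Assumption: tables are saved with an .xls or .xlsx extension.
    return sorted(
        p for p in image_dir.iterdir() if p.suffix.lower() in {".xls", ".xlsx"}
    )


if __name__ == "__main__":
    for path in list_table_files("../output/table", "../doc/table/1.png"):
        print(path)
```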

## 5. PaddleStructure whl package introduction

### 5.1 Use

#### 5.1.1 Use in code
```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
# Save the results under save_folder, using the image's base name
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    print(line)

# Draw the results on the image and save the visualization
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

#### 5.1.2 Use from the command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

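### Parameter description
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

| Parameter              | Description                                                                         | Default          |
|------------------------|-------------------------------------------------------------------------------------|------------------|
| output                 | Directory where the Excel files and recognition results are saved                   | ./output/table   |
| structure_max_len      | Size to which the long side of the image is resized for structure model prediction  | 488              |
| structure_model_dir    | Path to the structure inference model                                               | None             |
| structure_char_type    | Path to the dictionary used by the structure model                                  | ../ppocr/utils/dict/table_structure_dict.tx |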
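
As an illustration of how these fields might map onto the command line, the sketch below assembles a `paddlestructure` invocation from keyword arguments. It assumes each field is accepted as a `--name=value` flag in the same way as `--image_dir` in section 5.1.2; `build_command` is a hypothetical helper, not something the package provides.

```python
import shlex


def build_command(image_dir, **options):
    """Assemble a paddlestructure command line (hypothetical helper)."""
    args = ["paddlestructure", f"--image_dir={image_dir}"]
    # Assumption: every option is passed as a --name=value flag.
    for name, value in options.items():
        args.append(f"--{name}={value}")
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


if __name__ == "__main__":
    print(
        build_command(
            "../doc/table/1.png",
            output="./output/table",
            structure_max_len=488,
        )
    )
```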