api.md 2.3 KB
Newer Older
W
WenmuZhou 已提交
1 2
# PaddleStructure

文幕地方's avatar
文幕地方 已提交
3 4 5 6 7 8
install layoutparser
```sh
wget  https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

W
WenmuZhou 已提交
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
## 1. Introduction to pipeline

PaddleStructure is a toolkit for complex layout text OCR, the process is as follows

![pipeline](../doc/table/pipeline.png)

In PaddleStructure, the image will be analyzed by layoutparser first. In the layout analysis, the area in the image will be classified, and the OCR process will be carried out according to the category.

Currently layoutparser will output five categories:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Types 1-4 follow the traditional OCR process, and 5 follow the Table OCR process.

## 2. LayoutParser


## 3. Table OCR

[doc](table/README.md)

## 4. PaddleStructure whl package introduction

### 4.1 Use

4.1.1 Use by code
```python
W
WenmuZhou 已提交
39
import os
W
WenmuZhou 已提交
40
import cv2
W
WenmuZhou 已提交
41
from paddlestructure import PaddleStructure,draw_result,save_res
W
WenmuZhou 已提交
42

W
WenmuZhou 已提交
43
table_engine = PaddleStructure(show_log=True)
W
WenmuZhou 已提交
44

W
WenmuZhou 已提交
45
save_folder = './output/table'
W
WenmuZhou 已提交
46 47 48
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
W
WenmuZhou 已提交
49 50
save_res(result, save_folder,os.path.basename(img_path).split('.')[0])

W
WenmuZhou 已提交
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
for line in result:
    print(line)

from PIL import Image

font_path = 'path/tp/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result,font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

4.1.2 Use by command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

### 参数说明
大部分参数和paddleocr whl包保持一致,见 [whl包文档](../doc/doc_ch/whl.md)

| 字段                    | 说明                                            | 默认值           |
|------------------------|------------------------------------------------------|------------------|
| output                 | excel和识别结果保存的地址                    | ./output/table            |
| structure_max_len      |  structure模型预测时,图像的长边resize尺度             |  488            |
| structure_model_dir      |  structure inference 模型地址             |  None            |
| structure_char_type      |  structure 模型所用字典地址             |  ../ppocr/utils/dict/table_structure_dict.tx            |