# Python Inference

- [1. Layout Structured Analysis](#1-layout-structured-analysis)
  - [1.1 layout analysis + table recognition](#11-layout-analysis--table-recognition)
  - [1.2 layout analysis](#12-layout-analysis)
  - [1.3 table recognition](#13-table-recognition)
- [2. Key Information Extraction](#2-key-information-extraction)
  - [2.1 SER](#21-ser)
  - [2.2 RE+SER](#22-reser)

<a name="1"></a>
## 1. Layout Structured Analysis
Go to the `ppstructure` directory:

```bash
cd ppstructure
```

Download the models:

```bash
mkdir -p inference && cd inference
# Download the PP-StructureV2 layout analysis model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
# Download the PP-OCRv3 text detection model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the PP-OCRv3 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
# Download the PP-StructureV2 table recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
```
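
Before running the commands in the following sections, it can help to confirm that every archive was extracted. The sketch below is a minimal check, not part of PaddleOCR: `missing_models` is a hypothetical helper, and it assumes each archive unpacks into a directory named after the tar file, which is what the `--*_model_dir` paths used later imply.

```python
from pathlib import Path

# Directory names taken from the archives downloaded above
# (assumption: each tar file extracts to a directory of the same name).
SECTION_1_MODELS = [
    "picodet_lcnet_x1_0_layout_infer",
    "ch_PP-OCRv3_det_infer",
    "ch_PP-OCRv3_rec_infer",
    "ch_ppstructure_mobile_v2.0_SLANet_infer",
]


def missing_models(inference_dir, names=SECTION_1_MODELS):
    """Return the model directories that are absent or empty (hypothetical helper)."""
    root = Path(inference_dir)
    missing = []
    for name in names:
        model_dir = root / name
        # A directory that exists but holds no files usually means a failed extraction.
        if not model_dir.is_dir() or not any(model_dir.iterdir()):
            missing.append(name)
    return missing
```

Calling `missing_models("inference")` from the `ppstructure` directory returns an empty list when all four models are in place.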
<a name="1.1"></a>
### 1.1 layout analysis + table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf
```
After the command finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file, and each picture region is cropped and saved as an image; these files are named after the region's coordinates in the image. Detailed results are written to `res.txt`.
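
To inspect what a run produced, the result directory can be listed by file type. The following is a minimal sketch under stated assumptions: `collect_results` is a hypothetical helper, and "a directory with the same name" is taken to mean the image file name without its extension. The file extensions actually written may differ between PaddleOCR versions, so the code groups by whatever suffixes it finds instead of assuming them.

```python
from collections import defaultdict
from pathlib import Path


def collect_results(output_dir, image_path, mode="structure"):
    """Group the files in <output_dir>/<mode>/<image name> by suffix.

    Hypothetical helper; assumes the per-image directory is named after
    the image file without its extension.
    """
    result_dir = Path(output_dir) / mode / Path(image_path).stem
    grouped = defaultdict(list)
    if result_dir.is_dir():
        for path in sorted(result_dir.iterdir()):
            if path.is_file():
                grouped[path.suffix.lower()].append(path.name)
    return dict(grouped)
```

For the command above, `collect_results("../output", "./docs/table/1.png")` would look in `../output/structure/1` and return a dictionary mapping each suffix to the file names found there.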

<a name="1.2"></a>
### 1.2 layout analysis
```bash
python3 predict_system.py --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --output=../output \
                          --table=false \
                          --ocr=false
```
After the command finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each picture region in the image is cropped and saved, with a file name taken from its coordinates in the image. The layout analysis results are written to `res.txt`.

<a name="1.3"></a>
### 1.3 table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --image_dir=./docs/table/table.jpg \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf \
                          --layout=false
```
After the command finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file named after the table's coordinates in the image.
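
The three commands in sections 1.1 to 1.3 differ only in which models they load and which stages they switch off. The sketch below makes that explicit by building the argument list for each case from Python. `build_structure_command` and its `mode` names are hypothetical and introduced only for illustration. Every flag and path is copied from the commands above, and the list is meant to be run from the `ppstructure` directory.

```python
import subprocess

# Flags needed whenever OCR and table recognition are enabled (sections 1.1 and 1.3).
OCR_TABLE_FLAGS = [
    "--det_model_dir=inference/ch_PP-OCRv3_det_infer",
    "--rec_model_dir=inference/ch_PP-OCRv3_rec_infer",
    "--table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer",
    "--rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt",
    "--table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt",
    "--vis_font_path=../doc/fonts/simfang.ttf",
]
LAYOUT_FLAG = "--layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer"


def build_structure_command(image, mode="full", output="../output"):
    """Build the predict_system.py argument list (hypothetical helper).

    mode: "full" (1.1), "layout" (1.2) or "table" (1.3).
    """
    if mode not in ("full", "layout", "table"):
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["python3", "predict_system.py", f"--image_dir={image}", f"--output={output}"]
    if mode in ("full", "layout"):
        cmd.append(LAYOUT_FLAG)
    if mode in ("full", "table"):
        cmd.extend(OCR_TABLE_FLAGS)
    if mode == "layout":
        # Layout analysis only: switch off table recognition and OCR.
        cmd.extend(["--table=false", "--ocr=false"])
    if mode == "table":
        # Table recognition only: switch off layout analysis.
        cmd.append("--layout=false")
    return cmd


def run_structure(image, mode="full", output="../output"):
    """Run the command and raise if predict_system.py exits with an error."""
    subprocess.run(build_structure_command(image, mode, output), check=True)
```

For example, `run_structure("./docs/table/table.jpg", mode="table")` corresponds to the command in section 1.3.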

<a name="2"></a>
## 2. Key Information Extraction

### 2.1 SER
```bash
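# Skip this line if you are already in the ppstructure directory
cd ppstructure

# -p: do not fail if inference/ already exists
mkdir -p inference && cd inference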
# download model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/ser_vi_layoutxlm_xfund_infer.tar && tar -xf ser_vi_layoutxlm_xfund_infer.tar
cd ..
python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie
```

After the command finishes, the visualized result for each input image is saved in the `kie` directory under the path given by `--output` (not set in the command above, so its default value applies), with the same file name as the input image.


### 2.2 RE+SER

```bash
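# Skip this line if you are already in the ppstructure directory
cd ppstructure

# -p: do not fail if inference/ already exists
mkdir -p inference && cd inference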
# download model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/ser_vi_layoutxlm_xfund_infer.tar && tar -xf ser_vi_layoutxlm_xfund_infer.tar
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/re_vi_layoutxlm_xfund_infer.tar && tar -xf re_vi_layoutxlm_xfund_infer.tar
cd ..

python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --re_model_dir=./inference/re_vi_layoutxlm_xfund_infer \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie
```

After the command finishes, each input image gets a directory of the same name inside the `kie` directory under the path given by `--output`, which holds the visualized images and the prediction results.
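
The SER command in section 2.1 and the RE+SER command in section 2.2 are identical except for the `--re_model_dir` flag. As a minimal sketch, the hypothetical helper `build_kie_command` below builds either argument list. All flags and paths are copied from the two commands above, and the list is meant to be run from the `ppstructure` directory.

```python
import subprocess


def build_kie_command(image, with_re=False):
    """Build the KIE argument list; with_re=True adds the RE model (hypothetical helper)."""
    cmd = ["python3", "predict_system.py", "--kie_algorithm=LayoutXLM"]
    if with_re:
        # The only difference between section 2.2 and section 2.1.
        cmd.append("--re_model_dir=./inference/re_vi_layoutxlm_xfund_infer")
    cmd.extend([
        "--ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer",
        f"--image_dir={image}",
        "--ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt",
        "--vis_font_path=../doc/fonts/simfang.ttf",
        # No shell is involved here, so the value is passed without quotes.
        "--ocr_order_method=tb-yx",
        "--mode=kie",
    ])
    return cmd


def run_kie(image, with_re=False):
    """Run the command and raise if predict_system.py exits with an error."""
    subprocess.run(build_kie_command(image, with_re), check=True)
```

For example, `run_kie("./docs/kie/input/zh_val_42.jpg", with_re=True)` corresponds to the RE+SER command above.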