# Python Inference

- [1. Layout Structured Analysis](#1)
  - [1.1 layout analysis + table recognition](#1.1)
  - [1.2 layout analysis](#1.2)
  - [1.3 table recognition](#1.3)
- [2. Key Information Extraction](#2)

<a name="1"></a>
## 1. Layout Structured Analysis
Go to the `ppstructure` directory:

```bash
cd ppstructure
```

Download the models:

```bash
mkdir inference && cd inference
# Download the PP-Structurev2 layout analysis model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
# Download the PP-OCRv3 text detection model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the PP-OCRv3 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
# Download the PP-Structurev2 table recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
```
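
`mkdir inference` fails if the directory already exists, and re-running the download commands above fetches every archive again. A minimal sketch of a re-runnable alternative follows. It assumes each archive unpacks into a directory named after the archive, as the model paths used in this document suggest. `fetch_model` is a hypothetical helper, not part of PaddleOCR.

```bash
# Hypothetical helper (not part of PaddleOCR): download and unpack a model
# archive only if its directory is not already present.
# Assumption: foo_infer.tar unpacks into a directory called foo_infer.
fetch_model() {
    local url="$1"
    local dest="${2:-inference}"
    local archive="${url##*/}"          # file name part of the URL
    local model_dir="${archive%.tar}"   # archive name without .tar

    mkdir -p "$dest"
    if [ -d "$dest/$model_dir" ]; then
        echo "skip: $dest/$model_dir already exists"
        return 0
    fi
    # Run in a subshell so the caller's working directory is unchanged
    ( cd "$dest" && wget -q "$url" && tar xf "$archive" ) || return 1
    echo "ready: $dest/$model_dir"
}

# Example call, run from the ppstructure directory:
# fetch_model https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar
```

Calling the helper once per archive URL from the block above leaves the same `inference/` layout, so the commands in the following sections work unchanged.
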
<a name="1.1"></a>
### 1.1 layout analysis + table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file, and each picture region is cropped and saved as an image. The Excel and picture files are named after their coordinates in the image. Detailed results are written to the `res.txt` file.
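
To inspect the results for a given input, the output layout described above can be turned into a path. The sketch below uses a hypothetical `result_dir` helper, which is not part of PaddleOCR, and assumes the per-image directory is named after the image file without its extension.

```bash
# Hypothetical helper (not part of PaddleOCR): print the directory that should
# hold the results for one input image, i.e. <output>/structure/<image name>.
# Assumption: the directory name is the image file name without its extension.
result_dir() {
    local output="$1"
    local image="$2"
    local name="${image##*/}"   # drop leading directories: ./docs/table/1.png -> 1.png
    name="${name%.*}"           # drop the extension: 1.png -> 1
    echo "$output/structure/$name"
}

dir="$(result_dir ../output ./docs/table/1.png)"
if [ -d "$dir" ]; then
    ls "$dir"                   # Excel files, cropped pictures and res.txt
else
    echo "no results yet in $dir"
fi
```
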

<a name="1.2"></a>
### 1.2 layout analysis
```bash
python3 predict_system.py --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --output=../output \
                          --table=false \
                          --ocr=false
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each picture region in the image is cropped and saved, with a filename derived from its coordinates in the image. Layout analysis results are written to the `res.txt` file.

<a name="1.3"></a>
### 1.3 table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --image_dir=./docs/table/table.jpg \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf \
                          --layout=false
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file named after its coordinates in the image.
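
The three commands in sections 1.1 to 1.3 call the same script and differ mainly in which stages are switched off, along with which model directories are passed. As a rough summary, the hypothetical `stage_flags` helper below, which is not part of PaddleOCR, maps a mode name to the stage flags that appear in those commands.

```bash
# Hypothetical helper (not part of PaddleOCR): map a mode name to the
# stage-switching flags used by the commands in sections 1.1 to 1.3.
stage_flags() {
    case "$1" in
        full)   echo "" ;;                           # 1.1: no stage disabled
        layout) echo "--table=false --ocr=false" ;;  # 1.2: layout analysis only
        table)  echo "--layout=false" ;;             # 1.3: table recognition only
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Print the flags for each mode
for mode in full layout table; do
    printf '%-6s -> %s\n' "$mode" "$(stage_flags "$mode")"
done
```

The model directory, dictionary and font arguments still have to be supplied as shown in each section; the helper only captures the flags that select the stages.
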

<a name="2"></a>
## 2. Key Information Extraction

```bash
cd ppstructure

mkdir inference && cd inference
# download model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/ser_vi_layoutxlm_xfund_infer.tar && tar -xf ser_vi_layoutxlm_xfund_infer.tar
cd ..
python3 kie/predict_kie_token_ser.py \
  --kie_algorithm=LayoutXLM \
  --ser_model_dir=inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx"
```

When the run finishes, the visualized result for each image is saved in the `kie` directory under the directory specified by the `--output` argument, using the same file name as the input image.