# Python Inference

- [1. Structure](#1)
  - [1.1 layout analysis + table recognition](#1.1)
  - [1.2 layout analysis](#1.2)
  - [1.3 table recognition](#1.3)
- [2. DocVQA](#2)

<a name="1"></a>
## 1. Structure
Change to the `ppstructure` directory:

```bash
cd ppstructure
```

Download the models:

```bash
mkdir inference && cd inference
# Download the PP-OCRv2 text detection model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar && tar xf ch_PP-OCRv2_det_slim_quant_infer.tar
# Download the PP-OCRv2 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar
# Download the ultra-lightweight English table structure model and unzip it
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
cd ..
```
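
Before running inference it can help to confirm that the three archives were extracted where the later commands expect them. The following is a minimal sketch, not part of the toolkit: `check_models` is a hypothetical helper, and it assumes each archive unpacks into a directory named after the archive without the `.tar` suffix, which is what the `--det_model_dir`, `--rec_model_dir` and `--table_model_dir` paths used below imply.

```bash
# Hypothetical helper: report which expected model directories exist under a root.
# Assumption: each archive extracts to a directory named like the archive minus ".tar".
check_models() {
    root="$1"
    missing=0
    for name in \
        ch_PP-OCRv2_det_slim_quant_infer \
        ch_PP-OCRv2_rec_slim_quant_infer \
        en_ppocr_mobile_v2.0_table_structure_infer
    do
        if [ -d "$root/$name" ]; then
            echo "ok: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    # Non-zero exit status if at least one directory is missing
    return "$missing"
}
```

Called from the `ppstructure` directory as `check_models inference`, it prints one line per model and returns a non-zero status if any directory is absent.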
<a name="1.1"></a>
### 1.1 layout analysis + table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
                          --image_dir=./docs/table/1.png \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf
```
When the run finishes, each image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file, and each picture region is cropped and saved as an image. The Excel and picture files are named after their coordinates in the image. Detailed results are written to `res.txt`.
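
To make the output layout concrete, the sketch below builds the expected result directory for one input image. `structure_result_dir` is a hypothetical helper, and it assumes that "a directory with the same name" means the image file name without its extension.

```bash
# Hypothetical helper: print the directory where results for one image are expected.
# Assumption: the per-image directory is the image file name without its extension.
structure_result_dir() {
    output_root="${1%/}"            # drop one trailing slash, e.g. ../output/
    image_name="$(basename "$2")"   # ./docs/table/1.png -> 1.png
    printf '%s/structure/%s\n' "$output_root" "${image_name%.*}"
}

structure_result_dir ../output ./docs/table/1.png   # prints ../output/structure/1
```

Listing that directory after a run (for example with `ls "$(structure_result_dir ../output ./docs/table/1.png)"`) should show the saved files alongside `res.txt`.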

<a name="1.2"></a>
### 1.2 layout analysis
```bash
python3 predict_system.py --image_dir=./docs/table/1.png --table=false --ocr=false --output=../output/
```
When the run finishes, each image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each picture region in the image is cropped and saved, named after its coordinates in the image. The layout analysis results are written to `res.txt`.

<a name="1.3"></a>
### 1.3 table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
                          --image_dir=./docs/table/table.jpg \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf \
                          --layout=false
```
When the run finishes, each image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file named after its coordinates in the image.

<a name="2"></a>
## 2. DocVQA

```bash
cd ppstructure

# Download the pretrained model and unpack it
mkdir inference && cd inference
wget https://paddleocr.bj.bcebos.com/pplayout/PP-Layout_v1.0_ser_pretrained.tar && tar xf PP-Layout_v1.0_ser_pretrained.tar
cd ..

python3 predict_system.py --model_name_or_path=inference/PP-Layout_v1.0_ser_pretrained/ \
                          --mode=vqa \
                          --image_dir=vqa/images/input/zh_val_0.jpg \
                          --vis_font_path=../doc/fonts/simfang.ttf
```
When the run finishes, the visualized result for each image is saved in the `vqa` directory under the path given by the `output` parameter, with the same file name as the input image.
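
As a small illustration of that naming rule, the sketch below builds the expected path of the visualized image. `vqa_result_path` is a hypothetical helper, and the `./output` root in the example is only a placeholder, since the command above does not set the output directory explicitly.

```bash
# Hypothetical helper: print where the visualized image for one input is expected.
# The output root passed in is a placeholder; use the value of your `output` parameter.
vqa_result_path() {
    output_root="${1%/}"   # drop one trailing slash if present
    printf '%s/vqa/%s\n' "$output_root" "$(basename "$2")"
}

vqa_result_path ./output vqa/images/input/zh_val_0.jpg   # prints ./output/vqa/zh_val_0.jpg
```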