The figure shows the pipeline of layout analysis + table recognition. The image is first divided by layout analysis into four types of areas: image, text, title and table. OCR detection and recognition are then performed on the image, text and title areas, while the table area goes through table recognition. The image areas are also stored for later use.
...
...
@@ -48,7 +48,7 @@ The figure shows the pipeline of layout analysis + table recognition. The image
@@ -76,7 +76,7 @@ Start from [Quick Installation](./docs/quickstart.md)
### 6.1 Layout analysis and table recognition
![pipeline](../doc/table/pipeline.jpg)
![pipeline](docs/table/pipeline.jpg)
In PP-Structure, the image is divided into 5 types of areas: **text, title, image, list and table**. For the first 4 types of areas, the PP-OCR system is used directly for text detection and recognition. For the table area, the table structuring process converts the table in the image into an Excel file with the same table style.
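
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the PP-Structure implementation: `Region`, `route_regions`, `run_ocr` and `run_table` are hypothetical names, and the two callables stand in for the PP-OCR system and the table structuring process.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Area types named in the text: the first four are sent to OCR,
# the table area is sent to table structuring.
OCR_TYPES = {"text", "title", "image", "list"}
TABLE_TYPE = "table"


@dataclass
class Region:
    """Hypothetical layout-analysis output for one area of the page."""
    type: str
    bbox: Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def route_regions(
    regions: List[Region],
    run_ocr: Callable[[Region], Any],    # assumed stand-in for PP-OCR
    run_table: Callable[[Region], Any],  # assumed stand-in for table structuring
) -> List[Dict[str, Any]]:
    """Dispatch each region to OCR or table recognition based on its type."""
    results = []
    for region in regions:
        if region.type == TABLE_TYPE:
            res = run_table(region)
        elif region.type in OCR_TYPES:
            res = run_ocr(region)
        else:
            raise ValueError(f"unknown area type: {region.type!r}")
        results.append({"type": region.type, "bbox": region.bbox, "res": res})
    return results


# Example with placeholder callables in place of real models.
example = route_regions(
    [Region("title", (0, 0, 100, 20)), Region("table", (0, 30, 100, 90))],
    run_ocr=lambda r: "ocr",
    run_table=lambda r: "table",
)
```

Passing the two processing steps in as callables keeps the dispatch logic independent of any particular OCR or table model, so the sketch runs without PaddleOCR installed.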