diff --git a/doc/table/pipeline.jpg b/doc/table/pipeline.jpg
index 238b666ceb2090df9b2e10fe53d074dec3e381b3..8cea262149199e010450b69d3323b9b06e40c773 100644
Binary files a/doc/table/pipeline.jpg and b/doc/table/pipeline.jpg differ
diff --git a/doc/table/pipeline_en.jpg b/doc/table/pipeline_en.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..2e4d1a03546308ff79f4dfb6b67e8e83420951c5
Binary files /dev/null and b/doc/table/pipeline_en.jpg differ
diff --git a/ppstructure/README.md b/ppstructure/README.md
index d54c835df4fc85e23206a60adc4d5c8262db37a4..edd106a27149c8e10ee898f561132e8477af39ae 100644
--- a/ppstructure/README.md
+++ b/ppstructure/README.md
@@ -74,7 +74,7 @@ After running, each image will have a directory with the same name under the dir
 ## 2. PaddleStructure Pipeline
 the process is as follows
-![pipeline](../doc/table/pipeline.jpg)
+![pipeline](../doc/table/pipeline_en.jpg)
 In PaddleStructure, the image will be analyzed by layoutparser first. In the layout analysis, the area in the image will be classified, including **text, title, image, list and table** 5 categories. For the first 4 types of areas, directly use the PP-OCR to complete the text detection and recognition. The table area will be converted to an excel file of the same table style via Table OCR.