{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. PP-StructureV2 Introduction\n", "\n", "PP-StructureV2 improves on PP-StructureV1 in three main areas:\n", "\n", " * **System function upgrade**: Added image correction and layout restoration modules, image conversion to Word/PDF, and key information extraction capabilities!\n", " * **System performance optimization**:\n", "\t * Layout analysis: released a lightweight layout analysis model that is **11 times** faster, with an average CPU time of only **41ms**!\n", "\t * Table recognition: three optimization strategies improve model accuracy by **6%** with prediction time unchanged.\n", "\t * Key information extraction: a model structure that does not depend on visual features improves the accuracy of semantic entity recognition by **2.8%** and the accuracy of relation extraction by **9.1%**.\n", " * **Chinese scene adaptation**: Completed the Chinese scene adaptation for layout analysis and table recognition, and open-sourced an **out-of-the-box** Chinese scene layout structure model!\n", "\n", "The PP-StructureV2 framework is shown in the figure below. First, the Image Direction Correction module corrects the orientation of the input document image. In the Layout Information Extraction subsystem, shown in the upper branch, the layout analysis module divides the corrected image into regions such as text, tables, and images, and each region is then recognized separately. For example, table regions are sent to the table recognition module for structure recognition, and text regions are sent to the OCR engine for text recognition. Finally, the layout recovery module restores the image to an editable Word file consistent with the original image layout. 
In the Key Information Extraction subsystem, shown in the lower branch, the OCR engine extracts the text content, and the Semantic Entity Recognition and Relation Extraction modules then identify the entities in the image and the relationships between them, yielding the required key information.\n", "\n", "