diff --git a/README.md b/README.md index 06343f5d0bbe7479e2207d88a6e29f8e7fb4215e..d27d22b75fa006f2f210246a0865838fe3b711a2 100644 --- a/README.md +++ b/README.md @@ -24,7 +24,7 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools **Recent updates** - 2021.12.21 OCR open source online course starts. The lesson starts at 8:30 every night and lasts for ten days. Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207 -- 2021.12.21 release PaddleOCR v2.4, release 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR、SEED、SAR), 1 key information extraction algorithm (SDMGR) and 3 DocVQA algorithms (LayoutLM、LayoutLMv2,LayoutXLM). +- 2021.12.21 release PaddleOCR v2.4, release 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR、SEED、SAR), 1 key information extraction algorithm (SDMGR, [tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md)) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM, [tutorial](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa)). - PaddleOCR R&D team would like to share the key points of PP-OCRv2, at 20:15 pm on September 8th, [Course Address](https://aistudio.baidu.com/aistudio/education/group/info/6758). - 2021.9.7 release PaddleOCR v2.3, [PP-OCRv2](#PP-OCRv2) is proposed. The inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server in CPU device. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile. - 2021.8.3 released PaddleOCR v2.2, add a new structured documents analysis toolkit, i.e., [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), support layout analysis and table recognition (One-key to export chart images to Excel files). @@ -120,6 +120,8 @@ For a new language request, please refer to [Guideline for new language_requests - [PP-Structure: Information Extraction](./ppstructure/README.md) - [Layout Parser](./ppstructure/layout/README.md) - [Table Recognition](./ppstructure/table/README.md) + - [DocVQA](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa) + - [Key Information Extraction](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md) - Academic Circles - [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md) - [PGNet Algorithm](./doc/doc_en/pgnet_en.md) diff --git a/README_ch.md b/README_ch.md index f48ddf346b9752a0786a1d0ffadf18b069d20e27..e95595baf4d7d12e80147da1a053ffaef22cc0c1 100755 --- a/README_ch.md +++ b/README_ch.md @@ -20,7 +20,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 ## 近期更新 - 2021.12.21《动手学OCR · 十讲》课程开讲,12月21日起每晚八点半线上授课![免费报名地址](https://aistudio.baidu.com/aistudio/course/introduce/25207)。 -- 2021.12.21 发布PaddleOCR v2.4。OCR算法新增1种文本检测算法(PSENet),3种文本识别算法(NRTR、SEED、SAR);文档结构化算法新增1种关键信息提取算法(SDMGR),3种DocVQA算法(LayoutLM、LayoutLMv2,LayoutXLM)。 +- 2021.12.21 发布PaddleOCR v2.4。OCR算法新增1种文本检测算法(PSENet),3种文本识别算法(NRTR、SEED、SAR);文档结构化算法新增1种关键信息提取算法(SDMGR,[文档](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md)),3种DocVQA算法(LayoutLM、LayoutLMv2,LayoutXLM,[文档](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa))。 - PaddleOCR研发团队对最新发版内容技术深入解读,9月8日晚上20:15,[课程回放](https://aistudio.baidu.com/aistudio/education/group/info/6758)。 - 2021.9.7 发布PaddleOCR v2.3与[PP-OCRv2](#PP-OCRv2),CPU推理速度相比于PP-OCR server提升220%;效果相比于PP-OCR mobile 提升7%。 - 2021.8.3 发布PaddleOCR v2.2,新增文档结构分析[PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README_ch.md)工具包,支持版面分析与表格识别(含Excel导出)。 @@ -89,6 +89,8 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - [PP-Structure信息提取](./ppstructure/README_ch.md) - [版面分析](./ppstructure/layout/README_ch.md) - [表格识别](./ppstructure/table/README_ch.md) + - [DocVQA](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa) + - [关键信息提取](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md) - OCR学术圈 - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md) - [端到端PGNet算法](./doc/doc_ch/pgnet.md) diff --git "a/notebook/notebook_ch/\345\246\202\344\275\225\344\275\277\347\224\250\346\234\254\344\271\246.ipynb" "b/notebook/notebook_ch/\345\246\202\344\275\225\344\275\277\347\224\250\346\234\254\344\271\246.ipynb" new file mode 100644 index 0000000000000000000000000000000000000000..f195da97540a9160ff140a627075e34e61b3719b --- /dev/null +++ "b/notebook/notebook_ch/\345\246\202\344\275\225\344\275\277\347\224\250\346\234\254\344\271\246.ipynb" @@ -0,0 +1,83 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# 1. 课程预备知识\n", + "\n", + "本课所涉及的OCR模型建立在深度学习的基础之上,因此与其相关的基础知识、环境配置、项目工程与其他资料将在本节介绍,尤其对深度学习不熟悉的读者可以查看和学习相应内容。\n", + "\n", + "### 1.1 预备知识\n", + "\n", + "深度学习的“学习”由机器学习中的神经元、感知机、多层神经网络等内容一路发展而来,因此了解基础的机器学习算法对于深度学习的理解和应用有很大帮助。而深度学习的“深”则体现在对大量信息处理过程中使用的卷积、池化等一系列以向量为基础的数学运算。如果缺乏这两者的理论基础,可以学习李宏毅老师的[线性代数](https://aistudio.baidu.com/aistudio/course/introduce/2063)和[机器学习](https://aistudio.baidu.com/aistudio/course/introduce/1978)课程。\n", + "\n", + "对于深度学习本身的理解,可以参考百度杰出架构师毕然老师的零基础课程:[百度架构师手把手带你零基础实践深度学习](https://aistudio.baidu.com/aistudio/course/introduce/1297),其中覆盖了深度学习的发展历史,通过一个经典案例介绍深度学习的完整组成部分,是一套以实践为导向的深度学习课程。\n", + "\n", + "对于理论知识的实践,[Python基础知识](https://aistudio.baidu.com/aistudio/course/introduce/1224)必不可少,同时为了快速复现深度学习模型,本课程使用的深度学习框架为:飞桨PaddlePaddle。如果你已经使用过其他框架,通过[快速上手文档](https://www.paddlepaddle.org.cn/documentation/docs/zh/practices/quick_start/hello_paddle.html)可以迅速了解飞桨的使用方法。\n", + "\n", + "### 1.2 基础环境准备\n", + "\n", + "如果你想在本地环境运行本课程的代码且之前未搭建过Python环境,可以根据[零基础运行环境准备](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/environment.md),根据自己的操作系统安装Anaconda或docker环境。\n", + "\n", + "如果你没有本地资源,可以通过AI Studio实训平台完成代码运行,其中的每个项目都通过Notebook的方式呈现,方便开发者学习。若对Notebook的相关操作不熟悉,可以参考[AI Studio项目说明](https://ai.baidu.com/ai-doc/AISTUDIO/0k3e2tfzm)。\n", + "\n", + "### 1.3 获取和运行代码\n", + "\n", + "本课程依托PaddleOCR的代码库形成,首先,克隆PaddleOCR的完整项目:\n", + "\n", + "```bash\n", + "#【推荐】\n", + "git clone https://github.com/PaddlePaddle/PaddleOCR\n", + "\n", + "# 如果因为网络问题无法pull成功,也可选择使用码云上的托管:\n", + "git clone https://gitee.com/paddlepaddle/PaddleOCR\n", + "```\n", + "\n", + "> 注:码云托管代码可能无法实时同步本github项目更新,存在3~5天延时,请优先使用推荐方式。\n", + ">\n", + "> ​\t\t如果你不熟悉git操作,可以直接在PaddleOCR的首页的 `Code` 中下载压缩包\n", + "\n", + "然后安装第三方库:\n", + "\n", + "```\n", + "cd PaddleOCR\n", + "pip3 install -r requirements.txt\n", + "```\n", + "\n", + "\n", + "\n", + "### 1.4 查阅资料\n", + "\n", + "[PaddleOCR使用文档](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/README_ch.md#%E6%96%87%E6%A1%A3%E6%95%99%E7%A8%8B) (中文) 中详细介绍了如何使用PaddleOCR完成模型应用、训练和部署。文档内容丰富,大多数用户的问题都在文档或FAQ中有所描述,尤其在[FAQ(中文)](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/FAQ.md)中,按照深度学习的应用过程沉淀了用户的常见问题,建议大家仔细阅读。\n", + "\n", + "### 1.5 寻求帮助\n", + "\n", + "如果你在使用PaddleOCR的过程中遇到BUG、易用性或者文档相关的问题,可通过[Github issue](https://github.com/PaddlePaddle/PaddleOCR/issues)与官方联系,请按照issue模板尽可能多的提供信息,以便官方人员迅速定位问题。同时,微信群是广大PaddleOCR用户的日常交流阵地,更适合提问一些咨询类问题,除了有PaddleOCR团队成员以外,还会有热心开发者回答大家的问题。" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "py35-paddle1.2.0" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.4" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +}