diff --git a/doc/doc_ch/code_and_doc.md b/doc/doc_ch/code_and_doc.md new file mode 100644 index 0000000000000000000000000000000000000000..b1d8b4b36bd45fc1574b5049ce9af808a00b7574 --- /dev/null +++ b/doc/doc_ch/code_and_doc.md @@ -0,0 +1,324 @@ +# 附录 + +本附录包含了Python、文档规范以及Pull Request流程,请各位开发者遵循相关内容 + +- [附录1:Python代码规范](#附录1) + +- [附录2:文档规范](#附录2) + +- [附录3:Pull Request说明](#附录3) + + + +## 附录1:Python代码规范 + +PaddleOCR的Python代码遵循 [PEP8规范](https://www.python.org/dev/peps/pep-0008/),其中一些关注的重点包括如下内容 + +- 空格 + + - 空格应该加在逗号、分号、冒号前,而非他们的后面 + + ```python + # 正确: + print(x, y) + + # 错误: + print(x , y) + ``` + + - 在函数中指定关键字参数或默认参数值时, 不要在其两侧使用空格 + + ```python + # 正确: + def complex(real, imag=0.0) + # 错误: + def complex(real, imag = 0.0) + ``` + +- 注释 + + - 行内注释:行内注释使用 `#` 号表示,在代码与 `#` 之间需要空两个空格, `#` 与注释之间应当空一个空格,例如 + + ```python + x = x + 1 # Compensate for border + ``` + + - 函数和方法:每个函数的定义后的描述应该包括以下内容: + + - 函数描述:函数的作用,输入输出的 + + - Args:每个参数的名字以及对该参数的描述 + - Returns:返回值的含义和类型 + + ```python + def fetch_bigtable_rows(big_table, keys, other_silly_variable=None): + """Fetches rows from a Bigtable. + + Retrieves rows pertaining to the given keys from the Table instance + represented by big_table. Silly things may happen if + other_silly_variable is not None. + + Args: + big_table: An open Bigtable Table instance. + keys: A sequence of strings representing the key of each table row + to fetch. + other_silly_variable: Another optional variable, that has a much + longer name than the other args, and which does nothing. + + Returns: + A dict mapping keys to the corresponding table row data + fetched. Each row is represented as a tuple of strings. For + example: + + {'Serak': ('Rigel VII', 'Preparer'), + 'Zim': ('Irk', 'Invader'), + 'Lrrr': ('Omicron Persei 8', 'Emperor')} + + If a key from the keys argument is missing from the dictionary, + then that row was not found in the table. + """ + pass + ``` + + + +## 附录2:文档规范 + +### 2.1 总体说明 + +- 文档位置:如果您增加的新功能可以补充在原有的Markdown文件中,请**不要重新新建**一个文件。如果您对添加的位置不清楚,可以先PR代码,然后在commit中询问官方人员。 + +- 新增Markdown文档名称:使用英文描述文档内容,一般由小写字母与下划线组合而成,例如 `add_new_algorithm.md` + +- 新增Markdown文档格式:目录 - 正文 - FAQ + + > 目录生成方法可以使用 [此网站](https://ecotrust-canada.github.io/markdown-toc/) 将md内容复制之后自动提取目录,然后在md文件的每个标题前添加 `` + +- 中英双语:任何对文档的改动或新增都需要分别在中文和英文文档上进行。 + +### 2.2 格式规范 + +- 标题格式:文档标题格式按照:阿拉伯数字小数点组合 - 空格 - 标题的格式(例如 `2.1 XXXX` , `2. XXXX`) + +- 代码块:通过代码块格式展示需要运行的代码,在代码块前描述命令参数的含义。例如: + + > 检测+方向分类器+识别全流程:设置方向分类器参数 `--use_angle_cls true` 后可对竖排文本进行识别。 + > + > ``` + > paddleocr --image_dir ./imgs/11.jpg --use_angle_cls true + > ``` + +- 变量引用:如果在行内引用到代码变量或命令参数,需要用行内代码表示,例如上方 `--use_angle_cls true` ,并在前后各空一格 + +- 补充说明:通过引用格式 `>` 补充说明,或对注意事项进行说明 + +- 图片:如果在说明文档中增加了图片,请规范图片的命名形式(描述图片内容),并将图片添加在 `doc/` 下 + + + +## 附录3:Pull Request说明 + +### 3.1 PaddleOCR分支说明 + +PaddleOCR未来将维护2种分支,分别为: + +- release/x.x系列分支:为稳定的发行版本分支,也是默认分支。PaddleOCR会根据功能更新情况发布新的release分支,同时适配Paddle的release版本。随着版本迭代,release/x.x系列分支会越来越多,默认维护最新版本的release分支。 +- dygraph分支:为开发分支,适配Paddle动态图的dygraph版本,主要用于开发新功能。如果有同学需要进行二次开发,请选择dygraph分支。为了保证dygraph分支能在需要的时候拉出release/x.x分支,dygraph分支的代码只能使用Paddle最新release分支中有效的api。也就是说,如果Paddle dygraph分支中开发了新的api,但尚未出现在release分支代码中,那么请不要在PaddleOCR中使用。除此之外,对于不涉及api的性能优化、参数调整、策略更新等,都可以正常进行开发。 + +PaddleOCR的历史分支,未来将不再维护。考虑到一些同学可能仍在使用,这些分支还会继续保留: + +- develop分支:这个分支曾用于静态图的开发与测试,目前兼容>=1.7版本的Paddle。如果有特殊需求,要适配旧版本的Paddle,那还可以使用这个分支,但除了修复bug外不再更新代码。 + +PaddleOCR欢迎大家向repo中积极贡献代码,下面给出一些贡献代码的基本流程。 + +### 3.2 PaddleOCR代码提交流程与规范 + +> 如果你熟悉Git使用,可以直接跳转到 [3.2.10 提交代码的一些约定](#提交代码的一些约定) + +#### 3.2.1 创建你的 `远程仓库` + +- 在PaddleOCR的 [GitHub首页](https://github.com/PaddlePaddle/PaddleOCR),点击左上角 `Fork` 按钮,在你的个人目录下创建 `远程仓库`,比如`https://github.com/{your_name}/PaddleOCR`。 + +![banner](/Users/zhulingfeng01/OCR/PaddleOCR/doc/banner.png) + +- 将 `远程仓库` Clone到本地 + +``` +# 拉取develop分支的代码 +git clone https://github.com/{your_name}/PaddleOCR.git -b dygraph +cd PaddleOCR +``` + +> 多数情况下clone失败是由于网络原因,请稍后重试或配置代理 + +#### 3.2.2 和 `远程仓库` 建立连接 + +首先查看当前 `远程仓库` 的信息。 + +``` +git remote -v +# origin https://github.com/{your_name}/PaddleOCR.git (fetch) +# origin https://github.com/{your_name}/PaddleOCR.git (push) +``` + +只有clone的 `远程仓库` 的信息,也就是自己用户名下的 PaddleOCR,接下来我们创建一个原始 PaddleOCR 仓库的远程主机,命名为 upstream。 + +``` +git remote add upstream https://github.com/PaddlePaddle/PaddleOCR.git +``` + +使用 `git remote -v` 查看当前 `远程仓库` 的信息,输出如下,发现包括了origin和upstream 2个 `远程仓库` 。 + +``` +origin https://github.com/{your_name}/PaddleOCR.git (fetch) +origin https://github.com/{your_name}/PaddleOCR.git (push) +upstream https://github.com/PaddlePaddle/PaddleOCR.git (fetch) +upstream https://github.com/PaddlePaddle/PaddleOCR.git (push) +``` + +这主要是为了后续在提交pull request(PR)时,始终保持本地仓库最新。 + +#### 3.2.3 创建本地分支 + +可以基于当前分支创建新的本地分支,命令如下。 + +``` +git checkout -b new_branch +``` + +也可以基于远程或者上游的分支创建新的分支,命令如下。 + +``` +# 基于用户远程仓库(origin)的develop创建new_branch分支 +git checkout -b new_branch origin/develop +# 基于上游远程仓库(upstream)的develop创建new_branch分支 +# 如果需要从upstream创建新的分支,需要首先使用git fetch upstream获取上游代码 +git checkout -b new_branch upstream/develop +``` + +最终会显示切换到新的分支,输出信息如下 + +``` +Branch new_branch set up to track remote branch develop from upstream. +Switched to a new branch 'new_branch' +``` + +#### 3.2.4 使用pre-commit勾子 + +Paddle 开发人员使用 pre-commit 工具来管理 Git 预提交钩子。 它可以帮助我们格式化源代码(C++,Python),在提交(commit)前自动检查一些基本事宜(如每个文件只有一个 EOL,Git 中不要添加大文件等)。 + +pre-commit测试是 Travis-CI 中单元测试的一部分,不满足钩子的 PR 不能被提交到 PaddleOCR,首先安装并在当前目录运行它: + +``` +pip install pre-commit +pre-commit install +``` + + > 1. Paddle 使用 clang-format 来调整 C/C++ 源代码格式,请确保 `clang-format` 版本在 3.8 以上。 + > + > 2. 通过pip install pre-commit和conda install -c conda-forge pre-commit安装的yapf稍有不同的,PaddleOCR 开发人员使用的是 `pip install pre-commit`。 + +#### 3.2.5 修改与提交代码 + + 假设对PaddleOCR的 `README.md` 做了一些修改,可以通过 `git status` 查看改动的文件,然后使用 `git add` 添加改动文件。 + +``` +git status # 查看改动文件 +git add README.md +pre-commit +``` + +重复上述步骤,直到pre-comit格式检查不报错。如下所示。 + +[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/quick_start/community/003_precommit_pass.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/quick_start/community/003_precommit_pass.png) + +使用下面的命令完成提交。 + +``` +git commit -m "your commit info" +``` + +#### 3.2.6 保持本地仓库最新 + +获取 upstream 的最新代码并更新当前分支。这里的upstream来自于2.2节的`和远程仓库建立连接`部分。 + +``` +git fetch upstream +# 如果是希望提交到其他分支,则需要从upstream的其他分支pull代码,这里是develop +git pull upstream develop +``` + +#### 3.2.7 push到远程仓库 + +``` +git push origin new_branch +``` + +#### 3.2.7 提交Pull Request + +点击new pull request,选择本地分支和目标分支,如下图所示。在PR的描述说明中,填写该PR所完成的功能。接下来等待review,如果有需要修改的地方,参照上述步骤更新 origin 中的对应分支即可。 + +![banner](/Users/zhulingfeng01/OCR/PaddleOCR/doc/pr.png) + +#### 3.2.8 签署CLA协议和通过单元测试 + +- 签署CLA 在首次向PaddlePaddle提交Pull Request时,您需要您签署一次CLA(Contributor License Agreement)协议,以保证您的代码可以被合入,具体签署方式如下: + + 1. 请您查看PR中的Check部分,找到license/cla,并点击右侧detail,进入CLA网站 + + 2. 点击CLA网站中的“Sign in with GitHub to agree”,点击完成后将会跳转回您的Pull Request页面 + +#### 3.2.9 删除分支 + +- 删除远程分支 + + 在 PR 被 merge 进主仓库后,我们可以在 PR 的页面删除远程仓库的分支。 + + 也可以使用 `git push origin :分支名` 删除远程分支,如: + + ``` + git push origin :new_branch + ``` + +- 删除本地分支 + + ``` + # 切换到develop分支,否则无法删除当前分支 + git checkout develop + + # 删除new_branch分支 + git branch -D new_branch + ``` + + + +#### 3.2.10 提交代码的一些约定 + +为了使官方维护人员在评审代码时更好地专注于代码本身,请您每次提交代码时,遵守以下约定: + +1)请保证Travis-CI 中单元测试能顺利通过。如果没过,说明提交的代码存在问题,官方维护人员一般不做评审。 + +2)提交Pull Request前: + +- 请注意commit的数量。 + + 原因:如果仅仅修改一个文件但提交了十几个commit,每个commit只做了少量的修改,这会给评审人带来很大困扰。评审人需要逐一查看每个commit才能知道做了哪些修改,且不排除commit之间的修改存在相互覆盖的情况。 + + 建议:每次提交时,保持尽量少的commit,可以通过git commit --amend补充上次的commit。对已经Push到远程仓库的多个commit,可以参考[squash commits after push](https://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed)。 + +- 请注意每个commit的名称:应能反映当前commit的内容,不能太随意。 + + +3)如果解决了某个Issue的问题,请在该Pull Request的第一个评论框中加上:fix #issue_number,这样当该Pull Request被合并后,会自动关闭对应的Issue。关键词包括:close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved,请选择合适的词汇。详细可参考[Closing issues via commit messages](https://help.github.com/articles/closing-issues-via-commit-messages)。 + +此外,在回复评审人意见时,请您遵守以下约定: + +1)官方维护人员的每一个review意见都希望得到回复,这样会更好地提升开源社区的贡献。 + +- 对评审意见同意且按其修改完的,给个简单的Done即可; +- 对评审意见不同意的,请给出您自己的反驳理由。 + +2)如果评审意见比较多: + +- 请给出总体的修改情况。 +- 请采用`start a review`进行回复,而非直接回复的方式。原因是每个回复都会发送一封邮件,会造成邮件灾难。 \ No newline at end of file diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md index cfc9d52bf280400982a9fcd9941ddc4cce3f5e5c..f76ae7f842fb6b7002e084be59dc7ccb31f39771 100644 --- a/doc/doc_ch/detection.md +++ b/doc/doc_ch/detection.md @@ -247,3 +247,7 @@ Q1: 训练模型转inference 模型之后预测效果不一致? **A**:此类问题出现较多,问题多是trained model预测时候的预处理、后处理参数和inference model预测的时候的预处理、后处理参数不一致导致的。以det_mv3_db.yml配置文件训练的模型为例,训练模型、inference模型预测结果不一致问题解决方式如下: - 检查[trained model预处理](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/configs/det/det_mv3_db.yml#L116),和[inference model的预测预处理](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/tools/infer/predict_det.py#L42)函数是否一致。算法在评估的时候,输入图像大小会影响精度,为了和论文保持一致,训练icdar15配置文件中将图像resize到[736, 1280],但是在inference model预测的时候只有一套默认参数,会考虑到预测速度问题,默认限制图像最长边为960做resize的。训练模型预处理和inference模型的预处理函数位于[ppocr/data/imaug/operators.py](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/ppocr/data/imaug/operators.py#L147) - 检查[trained model后处理](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/configs/det/det_mv3_db.yml#L51),和[inference 后处理参数](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/tools/infer/utility.py#L50)是否一致。 + +Q1: 训练EAST模型提示找不到lanms库? + +**A**:执行pip3 install lanms-nova 即可。 diff --git a/doc/doc_ch/inference.md b/doc/doc_ch/inference.md index 4e0f1d131e2547f0d4a8bdf35c0f4a6f8bf2e7a3..c964d23117d022531d1181455a7b1c6c1d08ccae 100755 --- a/doc/doc_ch/inference.md +++ b/doc/doc_ch/inference.md @@ -34,6 +34,8 @@ inference 模型(`paddle.jit.save`保存的模型) - [1. 超轻量中文OCR模型推理](#超轻量中文OCR模型推理) - [2. 其他模型推理](#其他模型推理) +- [六、参数解释](参数解释) + ## 一、训练模型转inference模型 @@ -394,3 +396,127 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --d 执行命令后,识别结果图像如下: ![](../imgs_results/img_10_east_starnet.jpg) + + + + +# 六、参数解释 + +更多关于预测过程的参数解释如下所示。 + +* 全局信息 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| image_dir | str | 无,必须显式指定 | 图像或者文件夹路径 | +| vis_font_path | str | "./doc/fonts/simfang.ttf" | 用于可视化的字体路径 | +| drop_score | float | 0.5 | 识别得分小于该值的结果会被丢弃,不会作为返回结果 | +| use_pdserving | bool | False | 是否使用Paddle Serving进行预测 | +| warmup | bool | False | 是否开启warmup,在统计预测耗时的时候,可以使用这种方法 | +| draw_img_save_dir | str | "./inference_results" | 系统串联预测OCR结果的保存文件夹 | +| save_crop_res | bool | False | 是否保存OCR的识别文本图像 | +| crop_res_save_dir | str | "./output" | 保存OCR识别出来的文本图像路径 | +| use_mp | bool | False | 是否开启多进程预测 | +| total_process_num | int | 6 | 开启的进城数,`use_mp`为`True`时生效 | +| process_id | int | 0 | 当前进程的id号,无需自己修改 | +| benchmark | bool | False | 是否开启benchmark,对预测速度、显存占用等进行统计 | +| save_log_path | str | "./log_output/" | 开启`benchmark`时,日志结果的保存文件夹 | +| show_log | bool | True | 是否显示预测中的日志信息 | +| use_onnx | bool | False | 是否开启onnx预测 | + + +* 预测引擎相关 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| use_gpu | bool | True | 是否使用GPU进行预测 | +| ir_optim | bool | True | 是否对计算图进行分析与优化,开启后可以加速预测过程 | +| use_tensorrt | bool | False | 是否开启tensorrt | +| min_subgraph_size | int | 15 | tensorrt中最小子图size,当子图的size大于该值时,才会尝试对该子图使用trt engine计算 | +| precision | str | fp32 | 预测的精度,支持`fp32`, `fp16`, `int8` 3种输入 | +| enable_mkldnn | bool | True | 是否开启mkldnn | +| cpu_threads | int | 10 | 开启mkldnn时,cpu预测的线程数 | + +* 文本检测模型相关 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| det_algorithm | str | "DB" | 文本检测算法名称,目前支持`DB`, `EAST`, `SAST`, `PSE` | +| det_model_dir | str | xx | 检测inference模型路径 | +| det_limit_side_len | int | 960 | 检测的图像边长限制 | +| det_limit_type | str | "max" | 检测的变成限制类型,目前支持`min`, `max`,`min`表示保证图像最短边不小于`det_limit_side_len`,`max`表示保证图像最长边不大于`det_limit_side_len` | + +其中,DB算法相关参数如下 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| det_db_thresh | float | 0.3 | DB输出的概率图中,得分大于该阈值的像素点才会被认为是文字像素点 | +| det_db_box_thresh | float | 0.6 | 检测结果边框内,所有像素点的平均得分大于该阈值时,该结果会被认为是文字区域 | +| det_db_unclip_ratio | float | 1.5 | `Vatti clipping`算法的扩张系数,使用该方法对文字区域进行扩张 | +| max_batch_size | int | 10 | 预测的batch size | +| use_dilation | bool | False | 是否对分割结果进行膨胀以获取更优检测效果 | +| det_db_score_mode | str | "fast" | DB的检测结果得分计算方法,支持`fast`和`slow`,`fast`是根据polygon的外接矩形边框内的所有像素计算平均得分,`slow`是根据原始polygon内的所有像素计算平均得分,计算速度相对较慢一些,但是更加准确一些。 | + +EAST算法相关参数如下 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| det_east_score_thresh | float | 0.8 | EAST后处理中score map的阈值 | +| det_east_cover_thresh | float | 0.1 | EAST后处理中文本框的平均得分阈值 | +| det_east_nms_thresh | float | 0.2 | EAST后处理中nms的阈值 | + +SAST算法相关参数如下 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| det_sast_score_thresh | float | 0.5 | SAST后处理中的得分阈值 | +| det_sast_nms_thresh | float | 0.5 | SAST后处理中nms的阈值 | +| det_sast_polygon | bool | False | 是否多边形检测,弯曲文本场景(如Total-Text)设置为True | + +PSE算法相关参数如下 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| det_pse_thresh | float | 0.0 | 对输出图做二值化的阈值 | +| det_pse_box_thresh | float | 0.85 | 对box进行过滤的阈值,低于此阈值的丢弃 | +| det_pse_min_area | float | 16 | box的最小面积,低于此阈值的丢弃 | +| det_pse_box_type | str | "box" | 返回框的类型,box:四点坐标,poly: 弯曲文本的所有点坐标 | +| det_pse_scale | int | 1 | 输入图像相对于进后处理的图的比例,如`640*640`的图像,网络输出为`160*160`,scale为2的情况下,进后处理的图片shape为`320*320`。这个值调大可以加快后处理速度,但是会带来精度的下降 | + +* 文本识别模型相关 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| rec_algorithm | str | "CRNN" | 文本识别算法名称,目前支持`CRNN`, `SRN`, `RARE`, `NETR`, `SAR` | +| rec_model_dir | str | 无,如果使用识别模型,该项是必填项 | 识别inference模型路径 | +| rec_image_shape | list | [3, 32, 320] | 识别时的图像尺寸, | +| rec_batch_num | int | 6 | 识别的batch size | +| max_text_length | int | 25 | 识别结果最大长度,在`SRN`中有效 | +| rec_char_dict_path | str | "./ppocr/utils/ppocr_keys_v1.txt" | 识别的字符字典文件 | +| use_space_char | bool | True | 是否包含空格,如果为`True`,则会在最后字符字典中补充`空格`字符 | + + +* 端到端文本检测与识别模型相关 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| e2e_algorithm | str | "PGNet" | 端到端算法名称,目前支持`PGNet` | +| e2e_model_dir | str | 无,如果使用端到端模型,该项是必填项 | 端到端模型inference模型路径 | +| e2e_limit_side_len | int | 768 | 端到端的输入图像边长限制 | +| e2e_limit_type | str | "max" | 端到端的边长限制类型,目前支持`min`, `max`,`min`表示保证图像最短边不小于`e2e_limit_side_len`,`max`表示保证图像最长边不大于`e2e_limit_side_len` | +| e2e_pgnet_score_thresh | float | xx | xx | +| e2e_char_dict_path | str | "./ppocr/utils/ic15_dict.txt" | 识别的字典文件路径 | +| e2e_pgnet_valid_set | str | "totaltext" | 验证集名称,目前支持`totaltext`, `partvgg`,不同数据集对应的后处理方式不同,与训练过程保持一致即可 | +| e2e_pgnet_mode | str | "fast" | PGNet的检测结果得分计算方法,支持`fast`和`slow`,`fast`是根据polygon的外接矩形边框内的所有像素计算平均得分,`slow`是根据原始polygon内的所有像素计算平均得分,计算速度相对较慢一些,但是更加准确一些。 | + + +* 方向分类器模型相关 + +| 参数名称 | 类型 | 默认值 | 含义 | +| :--: | :--: | :--: | :--: | +| use_angle_cls | bool | False | 是否使用方向分类器 | +| cls_model_dir | str | 无,如果需要使用,则必须显式指定路径 | 方向分类器inference模型路径 | +| cls_image_shape | list | [3, 48, 192] | 预测尺度 | +| label_list | list | ['0', '180'] | class id对应的角度值 | +| cls_batch_num | int | 6 | 方向分类器预测的batch size | +| cls_thresh | float | 0.9 | 预测阈值,模型预测结果为180度,且得分大于该阈值时,认为最终预测结果为180度,需要翻转 | diff --git a/doc/joinus.PNG b/doc/joinus.PNG index cd9de9c14beaf0be346a1f7f1d09450a0905a880..99964b62d0e8a5867d5eb7a29640f0414c7af3b2 100644 Binary files a/doc/joinus.PNG and b/doc/joinus.PNG differ diff --git a/ppocr/postprocess/east_postprocess.py b/ppocr/postprocess/east_postprocess.py index ec6bf663854d3391bf8c584aa749dc6d1805d344..c194c81c6911aac0f9210109c37b76b44532e9c4 100755 --- a/ppocr/postprocess/east_postprocess.py +++ b/ppocr/postprocess/east_postprocess.py @@ -20,7 +20,6 @@ import numpy as np from .locality_aware_nms import nms_locality import cv2 import paddle -import lanms import os import sys @@ -61,6 +60,7 @@ class EASTPostProcess(object): """ restore text boxes from score map and geo map """ + score_map = score_map[0] geo_map = np.swapaxes(geo_map, 1, 0) geo_map = np.swapaxes(geo_map, 1, 2) @@ -76,8 +76,15 @@ class EASTPostProcess(object): boxes = np.zeros((text_box_restored.shape[0], 9), dtype=np.float32) boxes[:, :8] = text_box_restored.reshape((-1, 8)) boxes[:, 8] = score_map[xy_text[:, 0], xy_text[:, 1]] - boxes = lanms.merge_quadrangle_n9(boxes, nms_thresh) - # boxes = nms_locality(boxes.astype(np.float64), nms_thresh) + + try: + import lanms + boxes = lanms.merge_quadrangle_n9(boxes, nms_thresh) + except: + print( + 'you should install lanms by pip3 install lanms-nova to speed up nms_locality' + ) + boxes = nms_locality(boxes.astype(np.float64), nms_thresh) if boxes.shape[0] == 0: return [] # Here we filter some low score boxes by the average score map, diff --git a/ppocr/utils/save_load.py b/ppocr/utils/save_load.py index 4b890f6fa352772e6ebe1614b798e1ce69cdd17c..f6013a406634ed110ea5af613a5f31e56ce90ead 100644 --- a/ppocr/utils/save_load.py +++ b/ppocr/utils/save_load.py @@ -67,6 +67,7 @@ def load_model(config, model, optimizer=None): if key not in params: logger.warning("{} not in loaded params {} !".format( key, params.keys())) + continue pre_value = params[key] if list(value.shape) == list(pre_value.shape): new_state_dict[key] = pre_value @@ -76,9 +77,14 @@ def load_model(config, model, optimizer=None): format(key, value.shape, pre_value.shape)) model.set_state_dict(new_state_dict) - optim_dict = paddle.load(checkpoints + '.pdopt') if optimizer is not None: - optimizer.set_state_dict(optim_dict) + if os.path.exists(checkpoints + '.pdopt'): + optim_dict = paddle.load(checkpoints + '.pdopt') + optimizer.set_state_dict(optim_dict) + else: + logger.warning( + "{}.pdopt is not exists, params of optimizer is not loaded". + format(checkpoints)) if os.path.exists(checkpoints + '.states'): with open(checkpoints + '.states', 'rb') as f: diff --git a/requirements.txt b/requirements.txt index 903b8eda055573621f5d5479e85b17986b702ead..0c87c5c95069a2699f5a3a50320c883c6118ffe7 100644 --- a/requirements.txt +++ b/requirements.txt @@ -12,5 +12,4 @@ cython lxml premailer openpyxl -fasttext==0.9.1 -lanms-nova \ No newline at end of file +fasttext==0.9.1 \ No newline at end of file diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt index 2cd2ba5f1e8198cacadab653d3979d5a1662f9ea..cf4d3fde0221b2a0edaf3d0f9bde5d8ff02991da 100644 --- a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt +++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -5,12 +5,12 @@ infer_model:./inference/ch_PP-OCRv2_det_infer/ infer_export:null infer_quant:True inference:tools/infer/predict_system.py ---use_gpu:False ---enable_mkldnn:False +--use_gpu:False|True +--enable_mkldnn:False|True --cpu_threads:1|6 --rec_batch_num:1 ---use_tensorrt:False ---precision:int8 +--use_tensorrt:False|True +--precision:fp32|fp16 --det_model_dir: --image_dir:./inference/ch_det_data_50/all-sum-510/ --rec_model_dir:./inference/ch_PP-OCRv2_rec_infer/ diff --git a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt index 0ff24cbccfe282c12982714b5d079b0031703a04..1aad65b687992155133ed11533a14f642510361d 100644 --- a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt +++ b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -1,15 +1,17 @@ ===========================kl_quant_params=========================== model_name:PPOCRv2_ocr_det_kl python:python3.7 +Global.pretrained_model:null +Global.save_inference_dir:null infer_model:./inference/ch_PP-OCRv2_det_infer/ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o infer_quant:True inference:tools/infer/predict_det.py ---use_gpu:False ---enable_mkldnn:False +--use_gpu:False|True +--enable_mkldnn:True --cpu_threads:1|6 --rec_batch_num:1 ---use_tensorrt:False +--use_tensorrt:False|True --precision:int8 --det_model_dir: --image_dir:./inference/ch_det_data_50/all-sum-510/ diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt index 8826bb4f078d518a79748f9cb305268c5ec2c198..083a3ae26e726e290ffde4095821cbf3c40f7178 100644 --- a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt +++ b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -1,15 +1,17 @@ ===========================kl_quant_params=========================== model_name:PPOCRv2_ocr_rec_kl python:python3.7 +Global.pretrained_model:null +Global.save_inference_dir:null infer_model:./inference/ch_PP-OCRv2_rec_infer/ infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o infer_quant:True inference:tools/infer/predict_rec.py ---use_gpu:False ---enable_mkldnn:False +--use_gpu:False|True +--enable_mkldnn:False|True --cpu_threads:1|6 --rec_batch_num:1|6 ---use_tensorrt:False +--use_tensorrt:True --precision:int8 --rec_model_dir: --image_dir:./inference/rec_inference diff --git a/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt index 77889729e61a4b859895ee0de52c92ed258ace31..92ac3e9d37460a7f299f5cc2929a9bcaabdc34ef 100644 --- a/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt +++ b/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt @@ -4,7 +4,7 @@ python:python3.7 gpu_list:0|0,1 Global.use_gpu:True|True Global.auto_cast:null -Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300 Global.save_model_dir:./output/ Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Global.pretrained_model:null @@ -15,7 +15,7 @@ null:null trainer:fpgm_train norm_train:null pact_train:null -fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy +fpgm_train:deploy/slim/prune/sensitivity_anal.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy distill_train:null null:null null:null @@ -29,7 +29,7 @@ Global.save_inference_dir:./output/ Global.pretrained_model: norm_export:null quant_export:null -fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o +fpgm_export:deploy/slim/prune/export_prune_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o distill_export:null export1:null export2:null diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt index eea9d789dd4919fe8112d337e48b82fabacfc57a..6a023951713f2993cbec448d88cd4029919d5860 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -5,12 +5,12 @@ infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/ infer_export:null infer_quant:True inference:tools/infer/predict_system.py ---use_gpu:False ---enable_mkldnn:False +--use_gpu:False|True +--enable_mkldnn:False|True --cpu_threads:1|6 --rec_batch_num:1 ---use_tensorrt:False ---precision:int8 +--use_tensorrt:False|True +--precision:fp32|fp16 --det_model_dir: --image_dir:./inference/ch_det_data_50/all-sum-510/ --rec_model_dir:./inference/ch_ppocr_mobile_v2.0_rec_infer/ diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt index d9766d150de8b80522004f772941640438d3fdb5..977312f2a49e76d92e4edc11f8f0d3ecf866999a 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt @@ -4,7 +4,7 @@ python:python3.7 gpu_list:0|0,1 Global.use_gpu:True|True Global.auto_cast:null -Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300 Global.save_model_dir:./output/ Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Global.pretrained_model:null @@ -13,7 +13,7 @@ train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ null:null ## trainer:norm_train -norm_train:tools/train.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained +norm_train:tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained pact_train:null fpgm_train:null distill_train:null @@ -27,7 +27,7 @@ null:null ===========================infer_params=========================== Global.save_inference_dir:./output/ Global.pretrained_model: -norm_export:tools/export_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o +norm_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o quant_export:null fpgm_export:null distill_export:null diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt index 4001ca18284b703b92a6998d2218df3f003c74d3..014dad5fc9d87c08a0725f57127f8bf2cb248be3 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt @@ -4,7 +4,7 @@ python:python gpu_list:-1 Global.use_gpu:False Global.auto_cast:null -Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300 Global.save_model_dir:./output/ Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Global.pretrained_model:null @@ -12,10 +12,10 @@ train_model_name:latest train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ null:null ## -trainer:norm_train|pact_train|fpgm_train -norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained -pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o -fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy +trainer:norm_train +norm_train:tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained +pact_train:null +fpgm_train:null distill_train:null null:null null:null @@ -27,9 +27,9 @@ null:null ===========================infer_params=========================== Global.save_inference_dir:./output/ Global.pretrained_model: -norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o -quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o -fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o +norm_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o +quant_export:null +fpgm_export:null distill_export:null export1:null export2:null diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt index 0f4faee4b32925b4d0780ece6838c176238c7000..6a63b39d976c0e9693deec097c37eb0ff212d8af 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt @@ -4,7 +4,7 @@ python:python gpu_list:0 Global.use_gpu:True Global.auto_cast:fp32|amp -Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300 Global.save_model_dir:./output/ Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Global.pretrained_model:null @@ -12,10 +12,10 @@ train_model_name:latest train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ null:null ## -trainer:norm_train|pact_train|fpgm_train -norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained -pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o -fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy +trainer:norm_train +norm_train:tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained +pact_train:null +fpgm_train:null distill_train:null null:null null:null @@ -27,9 +27,9 @@ null:null ===========================infer_params=========================== Global.save_inference_dir:./output/ Global.pretrained_model: -norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o -quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o -fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o +norm_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o +quant_export:null +fpgm_export:null distill_export:null export1:null export2:null @@ -49,63 +49,4 @@ inference:tools/infer/predict_det.py null:null --benchmark:True null:null -===========================cpp_infer_params=========================== -use_opencv:True -infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/ -infer_quant:False -inference:./deploy/cpp_infer/build/ppocr det ---use_gpu:True|False ---enable_mkldnn:True|False ---cpu_threads:1|6 ---rec_batch_num:1 ---use_tensorrt:False|True ---precision:fp32|fp16 ---det_model_dir: ---image_dir:./inference/ch_det_data_50/all-sum-510/ -null:null ---benchmark:True -===========================serving_params=========================== -model_name:ocr_det -python:python3.7 -trans_model:-m paddle_serving_client.convert ---dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/ ---model_filename:inference.pdmodel ---params_filename:inference.pdiparams ---serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/ ---serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/ -serving_dir:./deploy/pdserving -web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1 -op.det.local_service_conf.devices:null|0 -op.det.local_service_conf.use_mkldnn:True|False -op.det.local_service_conf.thread_num:1|6 -op.det.local_service_conf.use_trt:False|True -op.det.local_service_conf.precision:fp32|fp16|int8 -pipline:pipeline_http_client.py|pipeline_rpc_client.py ---image_dir=../../doc/imgs -===========================kl_quant_params=========================== -infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/ -infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o -infer_quant:True -inference:tools/infer/predict_det.py ---use_gpu:True|False ---enable_mkldnn:True|False ---cpu_threads:1|6 ---rec_batch_num:1 ---use_tensorrt:False|True ---precision:int8 ---det_model_dir: ---image_dir:./inference/ch_det_data_50/all-sum-510/ -null:null ---benchmark:True -null:null -null:null -===========================lite_params=========================== -inference:./ocr_db_crnn det -infer_model:./models/ch_ppocr_mobile_v2.0_det_opt.nb|./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb ---cpu_threads:1|4 ---batch_size:1 ---power_mode:LITE_POWER_HIGH|LITE_POWER_LOW ---image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/|./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg ---config_dir:./config.txt ---rec_dict_dir:./ppocr_keys_v1.txt ---benchmark:True + diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt index bd58e964033243c00e7a270d642f97ced7659114..1039dcad06d63bb1fc1a47b7cc4760cd8d75ed63 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -1,15 +1,17 @@ ===========================kl_quant_params=========================== -model_name:PPOCRv2_ocr_det +model_name:ch_ppocr_mobile_v2.0_det_KL python:python3.7 +Global.pretrained_model:null +Global.save_inference_dir:null infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o infer_quant:True inference:tools/infer/predict_det.py ---use_gpu:False ---enable_mkldnn:False +--use_gpu:False|True +--enable_mkldnn:True --cpu_threads:1|6 --rec_batch_num:1 ---use_tensorrt:False +--use_tensorrt:False|True --precision:int8 --det_model_dir: --image_dir:./inference/ch_det_data_50/all-sum-510/ diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt index 7328be25ffd0ffa0abac83ec80e46be42ff93185..8a6c6568584250d269acfe63aef43ef66410fd99 100644 --- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt @@ -4,7 +4,7 @@ python:python3.7 gpu_list:0|0,1 Global.use_gpu:True|True Global.auto_cast:null -Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300 Global.save_model_dir:./output/ Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Global.pretrained_model:null @@ -14,7 +14,7 @@ null:null ## trainer:pact_train norm_train:null -pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o +pact_train:deploy/slim/quantization/quant.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o fpgm_train:null distill_train:null null:null @@ -28,7 +28,7 @@ null:null Global.save_inference_dir:./output/ Global.pretrained_model: norm_export:null -quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o +quant_export:deploy/slim/quantization/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o fpgm_export:null distill_export:null export1:null diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt new file mode 100644 index 0000000000000000000000000000000000000000..92f33c58c9e97347e53b778bde5a21472b769f36 --- /dev/null +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -0,0 +1,21 @@ +===========================kl_quant_params=========================== +model_name:ch_ppocr_mobile_v2.0_rec_KL +python:python3.7 +Global.pretrained_model:null +Global.save_inference_dir:null +infer_model:./inference/ch_ppocr_mobile_v2.0_rec_infer/ +infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml -o +infer_quant:True +inference:tools/infer/predict_rec.py +--use_gpu:False|True +--enable_mkldnn:True +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False|True +--precision:int8 +--det_model_dir: +--image_dir:./inference/rec_inference +null:null +--benchmark:True +null:null +null:null diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml new file mode 100644 index 0000000000000000000000000000000000000000..b06dafe7fdc01eadeee51e70dfa4e8c675bda531 --- /dev/null +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml @@ -0,0 +1,101 @@ +Global: + use_gpu: true + epoch_num: 500 + log_smooth_window: 20 + print_batch_step: 10 + save_model_dir: ./output/rec_chinese_lite_v2.0 + save_epoch_step: 3 + # evaluation is run every 5000 iterations after the 4000th iteration + eval_batch_step: [0, 2000] + cal_metric_during_train: True + pretrained_model: + checkpoints: + save_inference_dir: + use_visualdl: False + infer_img: doc/imgs_words/ch/word_1.jpg + # for data or label process + character_dict_path: ppocr/utils/ppocr_keys_v1.txt + max_text_length: 25 + infer_mode: False + use_space_char: True + save_res_path: ./output/rec/predicts_chinese_lite_v2.0.txt + + +Optimizer: + name: Adam + beta1: 0.9 + beta2: 0.999 + lr: + name: Cosine + learning_rate: 0.001 + regularizer: + name: 'L2' + factor: 0.00001 + +Architecture: + model_type: rec + algorithm: CRNN + Transform: + Backbone: + name: MobileNetV3 + scale: 0.5 + model_name: small + small_stride: [1, 2, 2, 2] + Neck: + name: SequenceEncoder + encoder_type: rnn + hidden_size: 48 + Head: + name: CTCHead + fc_decay: 0.00001 + +Loss: + name: CTCLoss + +PostProcess: + name: CTCLabelDecode + +Metric: + name: RecMetric + main_indicator: acc + +Train: + dataset: + name: SimpleDataSet + data_dir: train_data/ic15_data + label_file_list: ["train_data/ic15_data/rec_gt_train.txt"] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - RecAug: + - CTCLabelEncode: # Class handling label + - RecResizeImg: + image_shape: [3, 32, 320] + - KeepKeys: + keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order + loader: + shuffle: True + batch_size_per_card: 256 + drop_last: True + num_workers: 8 + +Eval: + dataset: + name: SimpleDataSet + data_dir: train_data/ic15_data + label_file_list: ["train_data/ic15_data/rec_gt_test.txt"] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - CTCLabelEncode: # Class handling label + - RecResizeImg: + image_shape: [3, 32, 320] + - KeepKeys: + keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order + loader: + shuffle: False + drop_last: False + batch_size_per_card: 256 + num_workers: 8 diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml new file mode 100644 index 0000000000000000000000000000000000000000..b06dafe7fdc01eadeee51e70dfa4e8c675bda531 --- /dev/null +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml @@ -0,0 +1,101 @@ +Global: + use_gpu: true + epoch_num: 500 + log_smooth_window: 20 + print_batch_step: 10 + save_model_dir: ./output/rec_chinese_lite_v2.0 + save_epoch_step: 3 + # evaluation is run every 5000 iterations after the 4000th iteration + eval_batch_step: [0, 2000] + cal_metric_during_train: True + pretrained_model: + checkpoints: + save_inference_dir: + use_visualdl: False + infer_img: doc/imgs_words/ch/word_1.jpg + # for data or label process + character_dict_path: ppocr/utils/ppocr_keys_v1.txt + max_text_length: 25 + infer_mode: False + use_space_char: True + save_res_path: ./output/rec/predicts_chinese_lite_v2.0.txt + + +Optimizer: + name: Adam + beta1: 0.9 + beta2: 0.999 + lr: + name: Cosine + learning_rate: 0.001 + regularizer: + name: 'L2' + factor: 0.00001 + +Architecture: + model_type: rec + algorithm: CRNN + Transform: + Backbone: + name: MobileNetV3 + scale: 0.5 + model_name: small + small_stride: [1, 2, 2, 2] + Neck: + name: SequenceEncoder + encoder_type: rnn + hidden_size: 48 + Head: + name: CTCHead + fc_decay: 0.00001 + +Loss: + name: CTCLoss + +PostProcess: + name: CTCLabelDecode + +Metric: + name: RecMetric + main_indicator: acc + +Train: + dataset: + name: SimpleDataSet + data_dir: train_data/ic15_data + label_file_list: ["train_data/ic15_data/rec_gt_train.txt"] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - RecAug: + - CTCLabelEncode: # Class handling label + - RecResizeImg: + image_shape: [3, 32, 320] + - KeepKeys: + keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order + loader: + shuffle: True + batch_size_per_card: 256 + drop_last: True + num_workers: 8 + +Eval: + dataset: + name: SimpleDataSet + data_dir: train_data/ic15_data + label_file_list: ["train_data/ic15_data/rec_gt_test.txt"] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - CTCLabelEncode: # Class handling label + - RecResizeImg: + image_shape: [3, 32, 320] + - KeepKeys: + keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order + loader: + shuffle: False + drop_last: False + batch_size_per_card: 256 + num_workers: 8 diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt new file mode 100644 index 0000000000000000000000000000000000000000..7bbdd58ae13eca00623123cf2ca39d3b76daa72a --- /dev/null +++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt @@ -0,0 +1,51 @@ +===========================train_params=========================== +model_name:ch_ppocr_mobile_v2.0_rec_PACT +python:python3.7 +gpu_list:0 +Global.use_gpu:True|True +Global.auto_cast:null +Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.save_model_dir:./output/ +Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128 +Global.checkpoints:null +train_model_name:latest +train_infer_img_dir:./train_data/ic15_data/test/word_1.png +null:null +## +trainer:pact_train +norm_train:null +pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml -o +fpgm_train:null +distill_train:null +null:null +null:null +## +===========================eval_params=========================== +eval:null +null:null +## +===========================infer_params=========================== +Global.save_inference_dir:./output/ +Global.checkpoints: +norm_export:null +quant_export:deploy/slim/quantization/export_model.py -ctest_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml -o +fpgm_export:null +distill_export:null +export1:null +export2:null +inference_dir:null +train_model:null +infer_export:null +infer_quant:False +inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_image_shape="3,32,100" +--use_gpu:True|False +--enable_mkldnn:True|False +--cpu_threads:1|6 +--rec_batch_num:1|6 +--use_tensorrt:False|True +--precision:fp32|fp16|int8 +--rec_model_dir: +--image_dir:./inference/rec_inference +--save_log_path:./test/output/ +--benchmark:True +null:null \ No newline at end of file diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt new file mode 100644 index 0000000000000000000000000000000000000000..5a93571a76366de191d2fb1736aa3ff4c71b1737 --- /dev/null +++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt @@ -0,0 +1,19 @@ +===========================ch_ppocr_mobile_v2.0=========================== +model_name:ch_ppocr_server_v2.0 +python:python3.7 +infer_model:./inference/ch_ppocr_server_v2.0_det_infer/ +infer_export:null +infer_quant:True +inference:tools/infer/predict_system.py +--use_gpu:False +--enable_mkldnn:False +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False +--precision:int8 +--det_model_dir: +--image_dir:./inference/ch_det_data_50/all-sum-510/ +--rec_model_dir:./inference/ch_ppocr_server_v2.0_rec_infer/ +--benchmark:True +null:null +null:null diff --git a/test_tipc/configs/det_mv3_db_v2.0/train_infer_python.txt b/test_tipc/configs/det_mv3_db_v2.0/train_infer_python.txt new file mode 100644 index 0000000000000000000000000000000000000000..de9df227722c8813fc8a21757b118875157a8a56 --- /dev/null +++ b/test_tipc/configs/det_mv3_db_v2.0/train_infer_python.txt @@ -0,0 +1,51 @@ +===========================train_params=========================== +model_name:det_mv3_db_v2.0 +python:python3.7 +gpu_list:0|0,1 +Global.use_gpu:True|True +Global.auto_cast:null +Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300 +Global.save_model_dir:./output/ +Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 +Global.pretrained_model:null +train_model_name:latest +train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ +null:null +## +trainer:norm_train +norm_train:tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained +pact_train:null +fpgm_train:null +distill_train:null +null:null +null:null +## +===========================eval_params=========================== +eval:null +null:null +## +===========================infer_params=========================== +Global.save_inference_dir:./output/ +Global.pretrained_model: +norm_export:tools/export_model.py -c configs/det/det_mv3_db.yml -o +quant_export:null +fpgm_export:null +distill_export:null +export1:null +export2:null +inference_dir:null +train_model:./inference/det_mv3_db_v2.0_train/best_accuracy +infer_export:tools/export_model.py -c configs/det/det_mv3_db.yml -o +infer_quant:False +inference:tools/infer/predict_det.py +--use_gpu:True|False +--enable_mkldnn:True|False +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False|True +--precision:fp32|fp16|int8 +--det_model_dir: +--image_dir:./inference/ch_det_data_50/all-sum-510/ +null:null +--benchmark:True +null:null \ No newline at end of file diff --git a/test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml b/test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml new file mode 100644 index 0000000000000000000000000000000000000000..d37fdcfbb5b27404403674d99c1b8abe8cd65e85 --- /dev/null +++ b/test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml @@ -0,0 +1,135 @@ +Global: + use_gpu: true + epoch_num: 600 + log_smooth_window: 20 + print_batch_step: 10 + save_model_dir: ./output/det_mv3_pse/ + save_epoch_step: 600 + # evaluation is run every 63 iterations + eval_batch_step: [ 0,1000 ] + cal_metric_during_train: False + pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained + checkpoints: #./output/det_r50_vd_pse_batch8_ColorJitter/best_accuracy + save_inference_dir: + use_visualdl: False + infer_img: doc/imgs_en/img_10.jpg + save_res_path: ./output/det_pse/predicts_pse.txt + +Architecture: + model_type: det + algorithm: PSE + Transform: null + Backbone: + name: MobileNetV3 + scale: 0.5 + model_name: large + Neck: + name: FPN + out_channels: 96 + Head: + name: PSEHead + hidden_dim: 96 + out_channels: 7 + +Loss: + name: PSELoss + alpha: 0.7 + ohem_ratio: 3 + kernel_sample_mask: pred + reduction: none + +Optimizer: + name: Adam + beta1: 0.9 + beta2: 0.999 + lr: + name: Step + learning_rate: 0.001 + step_size: 200 + gamma: 0.1 + regularizer: + name: 'L2' + factor: 0.0005 + +PostProcess: + name: PSEPostProcess + thresh: 0 + box_thresh: 0.85 + min_area: 16 + box_type: box # 'box' or 'poly' + scale: 1 + +Metric: + name: DetMetric + main_indicator: hmean + +Train: + dataset: + name: SimpleDataSet + data_dir: ./train_data/icdar2015/text_localization/ + label_file_list: + - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt + ratio_list: [ 1.0 ] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - DetLabelEncode: # Class handling label + - ColorJitter: + brightness: 0.12549019607843137 + saturation: 0.5 + - IaaAugment: + augmenter_args: + - { 'type': Resize, 'args': { 'size': [ 0.5, 3 ] } } + - { 'type': Fliplr, 'args': { 'p': 0.5 } } + - { 'type': Affine, 'args': { 'rotate': [ -10, 10 ] } } + - MakePseGt: + kernel_num: 7 + min_shrink_ratio: 0.4 + size: 640 + - RandomCropImgMask: + size: [ 640,640 ] + main_key: gt_text + crop_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] + - NormalizeImage: + scale: 1./255. + mean: [ 0.485, 0.456, 0.406 ] + std: [ 0.229, 0.224, 0.225 ] + order: 'hwc' + - ToCHWImage: + - KeepKeys: + keep_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] # the order of the dataloader list + loader: + shuffle: True + drop_last: False + batch_size_per_card: 16 + num_workers: 8 + +Eval: + dataset: + name: SimpleDataSet + data_dir: ./train_data/icdar2015/text_localization/ + label_file_list: + - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt + ratio_list: [ 1.0 ] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - DetLabelEncode: # Class handling label + - DetResizeForTest: + limit_side_len: 736 + limit_type: min + - NormalizeImage: + scale: 1./255. + mean: [ 0.485, 0.456, 0.406 ] + std: [ 0.229, 0.224, 0.225 ] + order: 'hwc' + - ToCHWImage: + - KeepKeys: + keep_keys: [ 'image', 'shape', 'polys', 'ignore_tags' ] + loader: + shuffle: False + drop_last: False + batch_size_per_card: 1 # must be 1 + num_workers: 8 \ No newline at end of file diff --git a/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt b/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt new file mode 100644 index 0000000000000000000000000000000000000000..f9909027f10d9e9f96d65f9f5a1c5f3fd5c9e1c6 --- /dev/null +++ b/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt @@ -0,0 +1,51 @@ +===========================train_params=========================== +model_name:det_mv3_pse_v2.0 +python:python3.7 +gpu_list:0 +Global.use_gpu:True|True +Global.auto_cast:fp32 +Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500 +Global.save_model_dir:./output/ +Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 +Global.pretrained_model:null +train_model_name:latest +train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ +null:null +## +trainer:norm_train +norm_train:tools/train.py -c test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml -o +pact_train:null +fpgm_train:null +distill_train:null +null:null +null:null +## +===========================eval_params=========================== +eval:null +null:null +## +===========================infer_params=========================== +Global.save_inference_dir:./output/ +Global.pretrained_model: +norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml -o +quant_export:null +fpgm_export:null +distill_export:null +export1:null +export2:null +## +train_model:./inference/det_mv3_pse/best_accuracy +infer_export:tools/export_model.py -c test_tipc/cconfigs/det_mv3_pse_v2.0/det_mv3_pse.yml -o +infer_quant:False +inference:tools/infer/predict_det.py +--use_gpu:True|False +--enable_mkldnn:True|False +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False|True +--precision:fp32|fp16|int8 +--det_model_dir: +--image_dir:./inference/ch_det_data_50/all-sum-510/ +--save_log_path:null +--benchmark:True +--det_algorithm:PSE diff --git a/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt new file mode 100644 index 0000000000000000000000000000000000000000..72b5ede15d27a33477b5ecd8302aaccc0ab56413 --- /dev/null +++ b/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt @@ -0,0 +1,51 @@ +===========================train_params=========================== +model_name:det_r50_db_v2.0 +python:python3.7 +gpu_list:0|0,1 +Global.use_gpu:True|True +Global.auto_cast:null +Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300 +Global.save_model_dir:./output/ +Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_lite_infer=4 +Global.pretrained_model:null +train_model_name:latest +train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ +null:null +## +trainer:norm_train +norm_train:tools/train.py -c configs/det/det_r50_vd_db.yml -o +quant_export:null +fpgm_export:null +distill_train:null +null:null +null:null +## +===========================eval_params=========================== +eval:tools/eval.py -c configs/det/det_r50_vd_db.yml -o +null:null +## +===========================infer_params=========================== +Global.save_inference_dir:./output/ +Global.pretrained_model: +norm_export:tools/export_model.py -c configs/det/det_r50_vd_db.yml -o +quant_export:null +fpgm_export:null +distill_export:null +export1:null +export2:null +## +train_model:./inference/ch_ppocr_server_v2.0_det_train/best_accuracy +infer_export:tools/export_model.py -c configs/det/det_r50_vd_db.yml -o +infer_quant:False +inference:tools/infer/predict_det.py +--use_gpu:True|False +--enable_mkldnn:True|False +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False|True +--precision:fp32|fp16|int8 +--det_model_dir: +--image_dir:./inference/ch_det_data_50/all-sum-510/ +--save_log_path:null +--benchmark:True +null:null \ No newline at end of file diff --git a/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt index e9eaa779520f78622509153482fd6a84322c9cc5..dfb376237ee35c277fcd86a88328c562d5c0429a 100644 --- a/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt +++ b/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt @@ -34,7 +34,7 @@ distill_export:null export1:null export2:null ## -train_model:./inference/det_mv3_east/best_accuracy +train_model:./inference/det_r50_vd_east/best_accuracy infer_export:tools/export_model.py -c test_tipc/cconfigs/det_r50_vd_east_v2.0/det_r50_vd_east.yml -o infer_quant:False inference:tools/infer/predict_det.py diff --git a/test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml b/test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml new file mode 100644 index 0000000000000000000000000000000000000000..5ebc4252718d5572837eac58061bf6f9eb35bf73 --- /dev/null +++ b/test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml @@ -0,0 +1,134 @@ +Global: + use_gpu: true + epoch_num: 600 + log_smooth_window: 20 + print_batch_step: 10 + save_model_dir: ./output/det_r50_vd_pse/ + save_epoch_step: 600 + # evaluation is run every 125 iterations + eval_batch_step: [ 0,1000 ] + cal_metric_during_train: False + pretrained_model: + checkpoints: #./output/det_r50_vd_pse_batch8_ColorJitter/best_accuracy + save_inference_dir: + use_visualdl: False + infer_img: doc/imgs_en/img_10.jpg + save_res_path: ./output/det_pse/predicts_pse.txt + +Architecture: + model_type: det + algorithm: PSE + Transform: + Backbone: + name: ResNet + layers: 50 + Neck: + name: FPN + out_channels: 256 + Head: + name: PSEHead + hidden_dim: 256 + out_channels: 7 + +Loss: + name: PSELoss + alpha: 0.7 + ohem_ratio: 3 + kernel_sample_mask: pred + reduction: none + +Optimizer: + name: Adam + beta1: 0.9 + beta2: 0.999 + lr: + name: Step + learning_rate: 0.0001 + step_size: 200 + gamma: 0.1 + regularizer: + name: 'L2' + factor: 0.0005 + +PostProcess: + name: PSEPostProcess + thresh: 0 + box_thresh: 0.85 + min_area: 16 + box_type: box # 'box' or 'poly' + scale: 1 + +Metric: + name: DetMetric + main_indicator: hmean + +Train: + dataset: + name: SimpleDataSet + data_dir: ./train_data/icdar2015/text_localization/ + label_file_list: + - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt + ratio_list: [ 1.0 ] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - DetLabelEncode: # Class handling label + - ColorJitter: + brightness: 0.12549019607843137 + saturation: 0.5 + - IaaAugment: + augmenter_args: + - { 'type': Resize, 'args': { 'size': [ 0.5, 3 ] } } + - { 'type': Fliplr, 'args': { 'p': 0.5 } } + - { 'type': Affine, 'args': { 'rotate': [ -10, 10 ] } } + - MakePseGt: + kernel_num: 7 + min_shrink_ratio: 0.4 + size: 640 + - RandomCropImgMask: + size: [ 640,640 ] + main_key: gt_text + crop_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] + - NormalizeImage: + scale: 1./255. + mean: [ 0.485, 0.456, 0.406 ] + std: [ 0.229, 0.224, 0.225 ] + order: 'hwc' + - ToCHWImage: + - KeepKeys: + keep_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] # the order of the dataloader list + loader: + shuffle: True + drop_last: False + batch_size_per_card: 8 + num_workers: 8 + +Eval: + dataset: + name: SimpleDataSet + data_dir: ./train_data/icdar2015/text_localization/ + label_file_list: + - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt + ratio_list: [ 1.0 ] + transforms: + - DecodeImage: # load image + img_mode: BGR + channel_first: False + - DetLabelEncode: # Class handling label + - DetResizeForTest: + limit_side_len: 736 + limit_type: min + - NormalizeImage: + scale: 1./255. + mean: [ 0.485, 0.456, 0.406 ] + std: [ 0.229, 0.224, 0.225 ] + order: 'hwc' + - ToCHWImage: + - KeepKeys: + keep_keys: [ 'image', 'shape', 'polys', 'ignore_tags' ] + loader: + shuffle: False + drop_last: False + batch_size_per_card: 1 # must be 1 + num_workers: 8 \ No newline at end of file diff --git a/test_tipc/configs/det_r50_vd_pse_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_pse_v2.0/train_infer_python.txt new file mode 100644 index 0000000000000000000000000000000000000000..5ab6d45d7c1eb5e3c17fd53a8c8c504812c1012c --- /dev/null +++ b/test_tipc/configs/det_r50_vd_pse_v2.0/train_infer_python.txt @@ -0,0 +1,51 @@ +===========================train_params=========================== +model_name:det_r50_vd_pse_v2.0 +python:python3.7 +gpu_list:0 +Global.use_gpu:True|True +Global.auto_cast:fp32 +Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500 +Global.save_model_dir:./output/ +Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 +Global.pretrained_model:null +train_model_name:latest +train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/ +null:null +## +trainer:norm_train +norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o +pact_train:null +fpgm_train:null +distill_train:null +null:null +null:null +## +===========================eval_params=========================== +eval:null +null:null +## +===========================infer_params=========================== +Global.save_inference_dir:./output/ +Global.pretrained_model: +norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o +quant_export:null +fpgm_export:null +distill_export:null +export1:null +export2:null +## +train_model:./inference/det_r50_vd_pse/best_accuracy +infer_export:tools/export_model.py -c test_tipc/cconfigs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o +infer_quant:False +inference:tools/infer/predict_det.py +--use_gpu:True|False +--enable_mkldnn:True|False +--cpu_threads:1|6 +--rec_batch_num:1 +--use_tensorrt:False|True +--precision:fp32|fp16|int8 +--det_model_dir: +--image_dir:./inference/ch_det_data_50/all-sum-510/ +--save_log_path:null +--benchmark:True +--det_algorithm:PSE diff --git a/test_tipc/prepare.sh b/test_tipc/prepare.sh index a0006375fd5d4998f2342afa4fe5bdfb3e9828ee..dec5742368a7f3f7ce197d835e7428d7478a1125 100644 --- a/test_tipc/prepare.sh +++ b/test_tipc/prepare.sh @@ -52,10 +52,16 @@ if [ ${MODE} = "lite_train_lite_infer" ];then wget -nc -P ./train_data/ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/total_text_lite.tar --no-check-certificate cd ./train_data && tar xf total_text_lite.tar && ln -s total_text && cd ../ fi - if [ ${model_name} == "rec_resnet_stn_bilstm_att_v2.0" ]; then - wget -nc https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz - gunzip cc.en.300.bin.gz + if [ ${model_name} == "det_mv3_db_v2.0" ]; then + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate + cd ./inference/ && tar xf det_mv3_db_v2.0_train.tar && cd ../ fi + if [ ${model_name} == "det_r50_db_v2.0" ]; then + wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_ssld_pretrained.pdparams --no-check-certificate + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar --no-check-certificate + cd ./inference/ && tar xf det_r50_vd_db_v2.0_train.tar && cd ../ + fi + elif [ ${MODE} = "whole_train_whole_infer" ];then wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate rm -rf ./train_data/icdar2015 @@ -104,12 +110,12 @@ elif [ ${MODE} = "whole_infer" ];then wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate cd ./inference && tar xf ch_ppocr_server_v2.0_det_train.tar && tar xf ch_det_data_50.tar && cd ../ - elif [ ${model_name} = "ocr_system_mobile" ]; then + elif [ ${model_name} = "ch_ppocr_mobile_v2.0" ]; then wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../ - elif [ ${model_name} = "ocr_system_server" ]; then + elif [ ${model_name} = "ch_ppocr_server_v2.0" ]; then wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate @@ -125,7 +131,7 @@ elif [ ${MODE} = "whole_infer" ];then wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate cd ./inference && tar xf ${eval_model_name}.tar && tar xf rec_inference.tar && cd ../ fi - elif [ ${model_name} = "ch_PPOCRv2_det" ]; then + if [ ${model_name} = "ch_PPOCRv2_det" ]; then eval_model_name="ch_PP-OCRv2_det_infer" wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate @@ -137,11 +143,22 @@ elif [ ${MODE} = "whole_infer" ];then fi if [ ${model_name} == "det_r50_vd_sast_icdar15_v2.0" ]; then wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar --no-check-certificate - cd ./inference/ && tar det_r50_vd_sast_icdar15_v2.0_train.tar && cd ../ + cd ./inference/ && tar xf det_r50_vd_sast_icdar15_v2.0_train.tar && cd ../ fi - + if [ ${model_name} == "det_mv3_db_v2.0" ]; then + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate + cd ./inference/ && tar xf det_mv3_db_v2.0_train.tar && cd ../ + fi + if [ ${model_name} == "det_r50_db_v2.0" ]; then + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar --no-check-certificate + cd ./inference/ && tar xf det_r50_vd_db_v2.0_train.tar && cd ../ + fi +fi if [ ${MODE} = "klquant_whole_infer" ]; then - if [ ${model_name} = "ch_ppocr_mobile_v2.0_det" ]; then + wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_lite.tar --no-check-certificate + cd ./train_data/ && tar xf icdar2015_lite.tar + ln -s ./icdar2015_lite ./icdar2015 && cd ../ + if [ ${model_name} = "ch_ppocr_mobile_v2.0_det_KL" ]; then wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../ @@ -152,6 +169,13 @@ if [ ${MODE} = "klquant_whole_infer" ]; then wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../ fi + if [ ${model_name} = "ch_ppocr_mobile_v2.0_rec_KL" ]; then + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate + wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate + wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate + cd ./train_data/ && tar xf ic15_data.tar && cd ../ + cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf rec_inference.tar && cd ../ + fi fi if [ ${MODE} = "cpp_infer" ];then diff --git a/test_tipc/test_train_inference_python.sh b/test_tipc/test_train_inference_python.sh index 7d035256527e01f31a4a1bc113caff3c744d859d..2a563022aa66c80bd2c9b6c2f822bee5ac3f3c18 100644 --- a/test_tipc/test_train_inference_python.sh +++ b/test_tipc/test_train_inference_python.sh @@ -90,36 +90,38 @@ infer_value1=$(func_parser_value "${lines[50]}") # parser klquant_infer if [ ${MODE} = "klquant_whole_infer" ]; then - dataline=$(awk 'NR==1 NR==17{print}' $FILENAME) + dataline=$(awk 'NR==1, NR==17{print}' $FILENAME) lines=(${dataline}) model_name=$(func_parser_value "${lines[1]}") python=$(func_parser_value "${lines[2]}") + export_weight=$(func_parser_key "${lines[3]}") + save_infer_key=$(func_parser_key "${lines[4]}") # parser inference model - infer_model_dir_list=$(func_parser_value "${lines[3]}") - infer_export_list=$(func_parser_value "${lines[4]}") - infer_is_quant=$(func_parser_value "${lines[5]}") + infer_model_dir_list=$(func_parser_value "${lines[5]}") + infer_export_list=$(func_parser_value "${lines[6]}") + infer_is_quant=$(func_parser_value "${lines[7]}") # parser inference - inference_py=$(func_parser_value "${lines[6]}") - use_gpu_key=$(func_parser_key "${lines[7]}") - use_gpu_list=$(func_parser_value "${lines[7]}") - use_mkldnn_key=$(func_parser_key "${lines[8]}") - use_mkldnn_list=$(func_parser_value "${lines[8]}") - cpu_threads_key=$(func_parser_key "${lines[9]}") - cpu_threads_list=$(func_parser_value "${lines[9]}") - batch_size_key=$(func_parser_key "${lines[10]}") - batch_size_list=$(func_parser_value "${lines[10]}") - use_trt_key=$(func_parser_key "${lines[11]}") - use_trt_list=$(func_parser_value "${lines[11]}") - precision_key=$(func_parser_key "${lines[12]}") - precision_list=$(func_parser_value "${lines[12]}") - infer_model_key=$(func_parser_key "${lines[13]}") - image_dir_key=$(func_parser_key "${lines[14]}") - infer_img_dir=$(func_parser_value "${lines[14]}") - save_log_key=$(func_parser_key "${lines[15]}") - benchmark_key=$(func_parser_key "${lines[16]}") - benchmark_value=$(func_parser_value "${lines[16]}") - infer_key1=$(func_parser_key "${lines[17]}") - infer_value1=$(func_parser_value "${lines[17]}") + inference_py=$(func_parser_value "${lines[8]}") + use_gpu_key=$(func_parser_key "${lines[9]}") + use_gpu_list=$(func_parser_value "${lines[9]}") + use_mkldnn_key=$(func_parser_key "${lines[10]}") + use_mkldnn_list=$(func_parser_value "${lines[10]}") + cpu_threads_key=$(func_parser_key "${lines[11]}") + cpu_threads_list=$(func_parser_value "${lines[11]}") + batch_size_key=$(func_parser_key "${lines[12]}") + batch_size_list=$(func_parser_value "${lines[12]}") + use_trt_key=$(func_parser_key "${lines[13]}") + use_trt_list=$(func_parser_value "${lines[13]}") + precision_key=$(func_parser_key "${lines[14]}") + precision_list=$(func_parser_value "${lines[14]}") + infer_model_key=$(func_parser_key "${lines[15]}") + image_dir_key=$(func_parser_key "${lines[16]}") + infer_img_dir=$(func_parser_value "${lines[16]}") + save_log_key=$(func_parser_key "${lines[17]}") + benchmark_key=$(func_parser_key "${lines[18]}") + benchmark_value=$(func_parser_value "${lines[18]}") + infer_key1=$(func_parser_key "${lines[19]}") + infer_value1=$(func_parser_value "${lines[19]}") fi LOG_PATH="./test_tipc/output" @@ -235,7 +237,7 @@ if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then fi #run inference is_quant=${infer_quant_flag[Count]} - if [ ${MODE} = "klquant_infer" ]; then + if [ ${MODE} = "klquant_whole_infer" ]; then is_quant="True" fi func_inference "${python}" "${inference_py}" "${save_infer_dir}" "${LOG_PATH}" "${infer_img_dir}" ${is_quant} diff --git a/tools/infer_det.py b/tools/infer_det.py index bb2cca7362e81494018aa3471664d60bef1b852c..1c679e0faf0d3ebdb6ca7ed4c317ce3eecfa910f 100755 --- a/tools/infer_det.py +++ b/tools/infer_det.py @@ -53,6 +53,7 @@ def draw_det_res(dt_boxes, config, img, img_name, save_path): logger.info("The detected Image saved in {}".format(save_path)) +@paddle.no_grad() def main(): global_config = config['Global']