diff --git a/doc/doc_ch/algorithm_det_east.md b/doc/doc_ch/algorithm_det_east.md
index 94a0d097d803cf5a74461be8faaadcabbd28938d..ef60e1e0752d61ea468c044e427d0df963b64b0a 100644
--- a/doc/doc_ch/algorithm_det_east.md
+++ b/doc/doc_ch/algorithm_det_east.md
@@ -26,8 +26,8 @@

 |模型|骨干网络|配置文件|precision|recall|Hmean|下载链接|
 | --- | --- | --- | --- | --- | --- | --- |
-|EAST|ResNet50_vd|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
-|EAST| MobileNetV3| 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|ResNet50_vd| [det_r50_vd_east.yml](../../configs/det/det_r50_vd_east.yml)|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|MobileNetV3|[det_mv3_east.yml](../../configs/det/det_mv3_east.yml) | 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|



diff --git a/doc/doc_ch/algorithm_det_sast.md b/doc/doc_ch/algorithm_det_sast.md
index 038d73fc15f3203bbcc17997c1a8e1c208f80ba8..f18eaf1a44cb18430fbc3f28d2451ac85e524863 100644
--- a/doc/doc_ch/algorithm_det_sast.md
+++ b/doc/doc_ch/algorithm_det_sast.md
@@ -73,9 +73,9 @@ python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Gl
 ```


-SAST文本检测模型推理,需要设置参数`--det_algorithm="SAST"`,同时,还需要增加参数`--det_sast_polygon=True`,可以执行如下命令:
+SAST文本检测模型推理,需要设置参数`--det_algorithm="SAST"`,同时,还需要增加参数`--det_box_type=poly`,可以执行如下命令:
 ```
-python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
+python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_box_type='poly'
 ```

 可视化文本检测结果默认保存到`./inference_results`文件夹里面,结果文件的名称前缀为'det_res'。结果示例如下:
diff --git a/doc/doc_ch/inference_args.md b/doc/doc_ch/inference_args.md
index 24e7223e397c94fe65b0f26d993fc507b323ed16..f6b7cbf06692c250db87c3c1863acc5ee0cf0cda 100644
--- a/doc/doc_ch/inference_args.md
+++ b/doc/doc_ch/inference_args.md
@@ -70,7 +70,7 @@ SAST算法相关参数如下
 | :--: | :--: | :--: | :--: |
 | det_sast_score_thresh | float | 0.5 | SAST后处理中的得分阈值 |
 | det_sast_nms_thresh | float | 0.5 | SAST后处理中nms的阈值 |
-| det_sast_polygon | bool | False | 是否多边形检测,弯曲文本场景(如Total-Text)设置为True |
+| det_box_type | str | quad | 是否多边形检测,弯曲文本场景(如Total-Text)设置为'poly' |

 PSE算法相关参数如下

@@ -79,7 +79,7 @@ PSE算法相关参数如下
 | det_pse_thresh | float | 0.0 | 对输出图做二值化的阈值 |
 | det_pse_box_thresh | float | 0.85 | 对box进行过滤的阈值,低于此阈值的丢弃 |
 | det_pse_min_area | float | 16 | box的最小面积,低于此阈值的丢弃 |
-| det_pse_box_type | str | "box" | 返回框的类型,box:四点坐标,poly: 弯曲文本的所有点坐标 |
+| det_box_type | str | "quad" | 返回框的类型,quad:四点坐标,poly: 弯曲文本的所有点坐标 |
 | det_pse_scale | int | 1 | 输入图像相对于进后处理的图的比例,如`640*640`的图像,网络输出为`160*160`,scale为2的情况下,进后处理的图片shape为`320*320`。这个值调大可以加快后处理速度,但是会带来精度的下降 |

 * 文本识别模型相关
diff --git a/doc/doc_en/algorithm_det_east_en.md b/doc/doc_en/algorithm_det_east_en.md
index 3848464abfd275fd319a24b0d3f6b3522c06c4a2..85440debfabc9fc8edf9701ba991d173b9da58cb 100644
--- a/doc/doc_en/algorithm_det_east_en.md
+++ b/doc/doc_en/algorithm_det_east_en.md
@@ -26,8 +26,9 @@ On the ICDAR2015 dataset, the text detection result is as follows:

 |Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
 | --- | --- | --- | --- | --- | --- | --- |
-|EAST|ResNet50_vd|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
-|EAST| MobileNetV3| 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|ResNet50_vd| [det_r50_vd_east.yml](../../configs/det/det_r50_vd_east.yml)|88.71%| 81.36%| 84.88%| [model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|MobileNetV3|[det_mv3_east.yml](../../configs/det/det_mv3_east.yml) | 78.20%| 79.10%| 78.65%| [model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|
+



diff --git a/doc/doc_en/algorithm_det_sast_en.md b/doc/doc_en/algorithm_det_sast_en.md
index e3437d22be9d75835aaa43e72363b498225db9e1..dde8eb32dc1d75270fa18155548a9fa6242c4215 100644
--- a/doc/doc_en/algorithm_det_sast_en.md
+++ b/doc/doc_en/algorithm_det_sast_en.md
@@ -74,10 +74,10 @@ First, convert the model saved in the SAST text detection training process into
 python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Global.pretrained_model=./det_r50_vd_sast_totaltext_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_sast_tt
 ```

-For SAST curved text detection model inference, you need to set the parameter `--det_algorithm="SAST"` and `--det_sast_polygon=True`, run the following command:
+For SAST curved text detection model inference, you need to set the parameter `--det_algorithm="SAST"` and `--det_box_type=poly`, run the following command:

 ```
-python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
+python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_box_type='poly'
 ```

 The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
diff --git a/doc/doc_en/inference_args_en.md b/doc/doc_en/inference_args_en.md
index b28cd8436da62dcd10f96f17751db9384ebcaa8d..3ace7324f54dfc9829546ce2f5e6679559619d63 100644
--- a/doc/doc_en/inference_args_en.md
+++ b/doc/doc_en/inference_args_en.md
@@ -70,7 +70,7 @@ The relevant parameters of the SAST algorithm are as follows
 | :--: | :--: | :--: | :--: |
 | det_sast_score_thresh | float | 0.5 | Score thresholds in SAST postprocess |
 | det_sast_nms_thresh | float | 0.5 | Thresholding of nms in SAST postprocess |
-| det_sast_polygon | bool | False | Whether polygon detection, curved text scene (such as Total-Text) is set to True |
+| det_box_type | str | 'quad' | Whether polygon detection, curved text scene (such as Total-Text) is set to 'poly' |

 The relevant parameters of the PSE algorithm are as follows

@@ -79,7 +79,7 @@ The relevant parameters of the PSE algorithm are as follows
 | det_pse_thresh | float | 0.0 | Threshold for binarizing the output image |
 | det_pse_box_thresh | float | 0.85 | Threshold for filtering boxes, below this threshold is discarded |
 | det_pse_min_area | float | 16 | The minimum area of the box, below this threshold is discarded |
-| det_pse_box_type | str | "box" | The type of the returned box, box: four point coordinates, poly: all point coordinates of the curved text |
+| det_box_type | str | "quad" | The type of the returned box, quad: four point coordinates, poly: all point coordinates of the curved text |
 | det_pse_scale | int | 1 | The ratio of the input image relative to the post-processed image, such as an image of `640*640`, the network output is `160*160`, and when the scale is 2, the shape of the post-processed image is `320*320`. Increasing this value can speed up the post-processing speed, but it will bring about a decrease in accuracy |

 * Text recognition model related parameters
diff --git a/ppocr/data/imaug/ct_process.py b/ppocr/data/imaug/ct_process.py
index 59715090036e1020800950b02b9ea06ab5c8d4c2..933d42f98c068780c2140740eddbc553cec02ee6 100644
--- a/ppocr/data/imaug/ct_process.py
+++ b/ppocr/data/imaug/ct_process.py
@@ -19,7 +19,8 @@ import pyclipper
 import paddle

 import numpy as np
-import Polygon as plg
+from ppocr.utils.utility import check_install
+
 import scipy.io as scio

 from PIL import Image
@@ -70,6 +71,8 @@ class MakeShrink():
         return peri

     def shrink(self, bboxes, rate, max_shr=20):
+        check_install('Polygon', 'Polygon3')
+        import Polygon as plg
         rate = rate * rate
         shrinked_bboxes = []
         for bbox in bboxes:
diff --git a/ppocr/data/imaug/drrg_targets.py b/ppocr/data/imaug/drrg_targets.py
index c56e878b837328ef2efde40b96b5571dffbb4791..7fdfd096819b266290353d842ef531e8220c586c 100644
--- a/ppocr/data/imaug/drrg_targets.py
+++ b/ppocr/data/imaug/drrg_targets.py
@@ -18,7 +18,7 @@ https://github.com/open-mmlab/mmocr/blob/main/mmocr/datasets/pipelines/textdet_t

 import cv2
 import numpy as np
-from lanms import merge_quadrangle_n9 as la_nms
+from ppocr.utils.utility import check_install
 from numpy.linalg import norm


@@ -543,6 +543,8 @@ class DRRGTargets(object):

         score = np.ones((text_comps.shape[0], 1), dtype=np.float32)
         text_comps = np.hstack([text_comps, score])
+        check_install('lanms', 'lanms-neo')
+        from lanms import merge_quadrangle_n9 as la_nms
         text_comps = la_nms(text_comps, self.text_comp_nms_thr)

         if text_comps.shape[0] >= 1:
diff --git a/ppocr/postprocess/east_postprocess.py b/ppocr/postprocess/east_postprocess.py
index c194c81c6911aac0f9210109c37b76b44532e9c4..e9ba095d3993ecb614b9569a7bb3c743268773ea 100755
--- a/ppocr/postprocess/east_postprocess.py
+++ b/ppocr/postprocess/east_postprocess.py
@@ -22,6 +22,7 @@ import cv2
 import paddle

 import os
+from ppocr.utils.utility import check_install
 import sys


@@ -78,11 +79,11 @@ class EASTPostProcess(object):
         boxes[:, 8] = score_map[xy_text[:, 0], xy_text[:, 1]]

         try:
+            check_install('lanms', 'lanms-nova')
             import lanms
-            boxes = lanms.merge_quadrangle_n9(boxes, nms_thresh)
         except:
             print(
-                'you should install lanms by pip3 install lanms-nova to speed up nms_locality'
+                'You should install lanms by pip3 install lanms-nova to speed up nms_locality'
             )
             boxes = nms_locality(boxes.astype(np.float64), nms_thresh)
         if boxes.shape[0] == 0:
diff --git a/ppocr/postprocess/sast_postprocess.py b/ppocr/postprocess/sast_postprocess.py
index bee75c05b1a3ea59193d566f91378c96797f533b..594bf17d6a0db2ebee17e7476834ce7b6b4289e6 100755
--- a/ppocr/postprocess/sast_postprocess.py
+++ b/ppocr/postprocess/sast_postprocess.py
@@ -141,6 +141,8 @@ class SASTPostProcess(object):

     def nms(self, dets):
         if self.is_python35:
+            from ppocr.utils.utility import check_install
+            check_install('lanms', 'lanms-nova')
             import lanms
             dets = lanms.merge_quadrangle_n9(dets, self.nms_thresh)
         else:
diff --git a/ppocr/utils/e2e_metric/Deteval.py b/ppocr/utils/e2e_metric/Deteval.py
index 6ce56eda2aa9f38fdc712d49ae64945c558b418d..387b3c24e6aa78fc2a7b4e03d5c133c3c6c61112 100755
--- a/ppocr/utils/e2e_metric/Deteval.py
+++ b/ppocr/utils/e2e_metric/Deteval.py
@@ -15,7 +15,11 @@
 import json
 import numpy as np
 import scipy.io as io
+
+from ppocr.utils.utility import check_install
+check_install("Polygon", "Polygon3")
 import Polygon as plg
+

 from ppocr.utils.e2e_metric.polygon_fast import iod, area_of_intersection, area

diff --git a/ppocr/utils/utility.py b/ppocr/utils/utility.py
index 18357c8e97bcea8ee321856a87146a4a7b901469..0f8660ceb4f0297f3e5c81d3b60f53cecab42070 100755
--- a/ppocr/utils/utility.py
+++ b/ppocr/utils/utility.py
@@ -19,6 +19,9 @@ import cv2
 import random
 import numpy as np
 import paddle
+import importlib.util
+import sys
+import subprocess


 def print_dict(d, logger, delimiter=0):
@@ -131,6 +134,26 @@ def set_seed(seed=1024):
     paddle.seed(seed)


+def check_install(module_name, install_name):
+    spec = importlib.util.find_spec(module_name)
+    if spec is None:
+        print(f'Warnning! The {module_name} module is NOT installed')
+        print(
+            f'Try install {module_name} module automatically. You can also try to install manually by pip install {install_name}.'
+        )
+        python = sys.executable
+        try:
+            subprocess.check_call(
+                [python, '-m', 'pip', 'install', install_name],
+                stdout=subprocess.DEVNULL)
+            print(f'The {module_name} module is now installed')
+        except subprocess.CalledProcessError as exc:
+            raise Exception(
+                f"Install {module_name} failed, please install manually")
+    else:
+        print(f"{module_name} has been installed.")
+
+
 class AverageMeter:
     def __init__(self):
         self.reset()
diff --git a/requirements.txt b/requirements.txt
index 9b73e3bf8bb4fbbed1961aa4d76ab3a6b8e8e3a5..f3d9ce89e3e2ae9079598d37f75b9e4e63d871a6 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -13,7 +13,5 @@ cython
 lxml
 premailer
 openpyxl
-attrdict3
-Polygon3
-lanms-neo==1.0.2
+attrdict
 PyMuPDF==1.19.0
\ No newline at end of file
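
Review note on `check_install` in `ppocr/utils/utility.py`: the helper checks whether a module is importable and otherwise shells out to `pip`. The standalone sketch below illustrates the same pattern and is not the code from this change. The names `ensure_module`, `pip_install`, and the `installer` parameter are hypothetical and exist only so the logic can be exercised without touching the network. It also makes two design choices that differ from the diff, both assumptions on my part rather than project policy: it prints nothing when the module is already present, and it chains the original exception when installation fails.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install(install_name):
    """Install a distribution with the pip of the running interpreter."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", install_name],
        stdout=subprocess.DEVNULL)


def ensure_module(module_name, install_name, installer=pip_install):
    """Return True if module_name was already importable, False if it was installed.

    Hypothetical helper for illustration; `installer` is injectable for testing.
    """
    if importlib.util.find_spec(module_name) is not None:
        return True  # already present, so stay silent
    print(f"{module_name} is not installed; trying: pip install {install_name}")
    try:
        installer(install_name)
    except subprocess.CalledProcessError as exc:
        # Keep the original failure attached for easier debugging.
        raise RuntimeError(
            f"Installing {install_name} failed, please install it manually"
        ) from exc
    # Let the import system notice packages added while the process is running.
    importlib.invalidate_caches()
    return False
```

A call site would look like `ensure_module("Polygon", "Polygon3")` immediately before `import Polygon as plg`, which mirrors how the diff uses `check_install`.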