diff --git a/doc/doc_ch/algorithm_det_east.md b/doc/doc_ch/algorithm_det_east.md
index 94a0d097d803cf5a74461be8faaadcabbd28938d..ef60e1e0752d61ea468c044e427d0df963b64b0a 100644
--- a/doc/doc_ch/algorithm_det_east.md
+++ b/doc/doc_ch/algorithm_det_east.md
@@ -26,8 +26,8 @@
|模型|骨干网络|配置文件|precision|recall|Hmean|下载链接|
| --- | --- | --- | --- | --- | --- | --- |
-|EAST|ResNet50_vd|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
-|EAST| MobileNetV3| 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|ResNet50_vd| [det_r50_vd_east.yml](../../configs/det/det_r50_vd_east.yml)|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|MobileNetV3|[det_mv3_east.yml](../../configs/det/det_mv3_east.yml) | 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|
diff --git a/doc/doc_ch/algorithm_det_sast.md b/doc/doc_ch/algorithm_det_sast.md
index 038d73fc15f3203bbcc17997c1a8e1c208f80ba8..f18eaf1a44cb18430fbc3f28d2451ac85e524863 100644
--- a/doc/doc_ch/algorithm_det_sast.md
+++ b/doc/doc_ch/algorithm_det_sast.md
@@ -73,9 +73,9 @@ python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Gl
```
-SAST文本检测模型推理,需要设置参数`--det_algorithm="SAST"`,同时,还需要增加参数`--det_sast_polygon=True`,可以执行如下命令:
+SAST文本检测模型推理,需要设置参数`--det_algorithm="SAST"`,同时,还需要增加参数`--det_box_type=poly`,可以执行如下命令:
```
-python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
+python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_box_type='poly'
```
可视化文本检测结果默认保存到`./inference_results`文件夹里面,结果文件的名称前缀为'det_res'。结果示例如下:
diff --git a/doc/doc_ch/inference_args.md b/doc/doc_ch/inference_args.md
index 24e7223e397c94fe65b0f26d993fc507b323ed16..f6b7cbf06692c250db87c3c1863acc5ee0cf0cda 100644
--- a/doc/doc_ch/inference_args.md
+++ b/doc/doc_ch/inference_args.md
@@ -70,7 +70,7 @@ SAST算法相关参数如下
| :--: | :--: | :--: | :--: |
| det_sast_score_thresh | float | 0.5 | SAST后处理中的得分阈值 |
| det_sast_nms_thresh | float | 0.5 | SAST后处理中nms的阈值 |
-| det_sast_polygon | bool | False | 是否多边形检测,弯曲文本场景(如Total-Text)设置为True |
+| det_box_type | str | quad | 是否多边形检测,弯曲文本场景(如Total-Text)设置为'poly' |
PSE算法相关参数如下
@@ -79,7 +79,7 @@ PSE算法相关参数如下
| det_pse_thresh | float | 0.0 | 对输出图做二值化的阈值 |
| det_pse_box_thresh | float | 0.85 | 对box进行过滤的阈值,低于此阈值的丢弃 |
| det_pse_min_area | float | 16 | box的最小面积,低于此阈值的丢弃 |
-| det_pse_box_type | str | "box" | 返回框的类型,box:四点坐标,poly: 弯曲文本的所有点坐标 |
+| det_box_type | str | "quad" | 返回框的类型,quad:四点坐标,poly: 弯曲文本的所有点坐标 |
| det_pse_scale | int | 1 | 输入图像相对于进后处理的图的比例,如`640*640`的图像,网络输出为`160*160`,scale为2的情况下,进后处理的图片shape为`320*320`。这个值调大可以加快后处理速度,但是会带来精度的下降 |
* 文本识别模型相关
diff --git a/doc/doc_en/algorithm_det_east_en.md b/doc/doc_en/algorithm_det_east_en.md
index 3848464abfd275fd319a24b0d3f6b3522c06c4a2..85440debfabc9fc8edf9701ba991d173b9da58cb 100644
--- a/doc/doc_en/algorithm_det_east_en.md
+++ b/doc/doc_en/algorithm_det_east_en.md
@@ -26,8 +26,9 @@ On the ICDAR2015 dataset, the text detection result is as follows:
|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
-|EAST|ResNet50_vd|88.71%| 81.36%| 84.88%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
-|EAST| MobileNetV3| 78.20%| 79.10%| 78.65%| [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|ResNet50_vd| [det_r50_vd_east.yml](../../configs/det/det_r50_vd_east.yml)|88.71%| 81.36%| 84.88%| [model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|MobileNetV3|[det_mv3_east.yml](../../configs/det/det_mv3_east.yml) | 78.20%| 79.10%| 78.65%| [model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|
+
diff --git a/doc/doc_en/algorithm_det_sast_en.md b/doc/doc_en/algorithm_det_sast_en.md
index e3437d22be9d75835aaa43e72363b498225db9e1..dde8eb32dc1d75270fa18155548a9fa6242c4215 100644
--- a/doc/doc_en/algorithm_det_sast_en.md
+++ b/doc/doc_en/algorithm_det_sast_en.md
@@ -74,10 +74,10 @@ First, convert the model saved in the SAST text detection training process into
python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Global.pretrained_model=./det_r50_vd_sast_totaltext_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_sast_tt
```
-For SAST curved text detection model inference, you need to set the parameter `--det_algorithm="SAST"` and `--det_sast_polygon=True`, run the following command:
+For SAST curved text detection model inference, set the parameters `--det_algorithm="SAST"` and `--det_box_type=poly`, then run the following command:
```
-python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
+python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_box_type='poly'
```
The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
diff --git a/doc/doc_en/inference_args_en.md b/doc/doc_en/inference_args_en.md
index b28cd8436da62dcd10f96f17751db9384ebcaa8d..3ace7324f54dfc9829546ce2f5e6679559619d63 100644
--- a/doc/doc_en/inference_args_en.md
+++ b/doc/doc_en/inference_args_en.md
@@ -70,7 +70,7 @@ The relevant parameters of the SAST algorithm are as follows
| :--: | :--: | :--: | :--: |
| det_sast_score_thresh | float | 0.5 | Score thresholds in SAST postprocess |
| det_sast_nms_thresh | float | 0.5 | Thresholding of nms in SAST postprocess |
-| det_sast_polygon | bool | False | Whether polygon detection, curved text scene (such as Total-Text) is set to True |
+| det_box_type | str | 'quad' | The type of the returned box; set to 'poly' for curved text scenes (such as Total-Text) |
The relevant parameters of the PSE algorithm are as follows
@@ -79,7 +79,7 @@ The relevant parameters of the PSE algorithm are as follows
| det_pse_thresh | float | 0.0 | Threshold for binarizing the output image |
| det_pse_box_thresh | float | 0.85 | Threshold for filtering boxes, below this threshold is discarded |
| det_pse_min_area | float | 16 | The minimum area of the box, below this threshold is discarded |
-| det_pse_box_type | str | "box" | The type of the returned box, box: four point coordinates, poly: all point coordinates of the curved text |
+| det_box_type | str | "quad" | The type of the returned box, quad: four point coordinates, poly: all point coordinates of the curved text |
| det_pse_scale | int | 1 | The ratio of the input image relative to the post-processed image, such as an image of `640*640`, the network output is `160*160`, and when the scale is 2, the shape of the post-processed image is `320*320`. Increasing this value can speed up the post-processing speed, but it will bring about a decrease in accuracy |
* Text recognition model related parameters
diff --git a/ppocr/data/imaug/ct_process.py b/ppocr/data/imaug/ct_process.py
index 59715090036e1020800950b02b9ea06ab5c8d4c2..933d42f98c068780c2140740eddbc553cec02ee6 100644
--- a/ppocr/data/imaug/ct_process.py
+++ b/ppocr/data/imaug/ct_process.py
@@ -19,7 +19,8 @@ import pyclipper
import paddle
import numpy as np
-import Polygon as plg
+from ppocr.utils.utility import check_install
+
import scipy.io as scio
from PIL import Image
@@ -70,6 +71,8 @@ class MakeShrink():
return peri
def shrink(self, bboxes, rate, max_shr=20):
+ check_install('Polygon', 'Polygon3')
+ import Polygon as plg
rate = rate * rate
shrinked_bboxes = []
for bbox in bboxes:
diff --git a/ppocr/data/imaug/drrg_targets.py b/ppocr/data/imaug/drrg_targets.py
index c56e878b837328ef2efde40b96b5571dffbb4791..7fdfd096819b266290353d842ef531e8220c586c 100644
--- a/ppocr/data/imaug/drrg_targets.py
+++ b/ppocr/data/imaug/drrg_targets.py
@@ -18,7 +18,7 @@ https://github.com/open-mmlab/mmocr/blob/main/mmocr/datasets/pipelines/textdet_t
import cv2
import numpy as np
-from lanms import merge_quadrangle_n9 as la_nms
+from ppocr.utils.utility import check_install
from numpy.linalg import norm
@@ -543,6 +543,8 @@ class DRRGTargets(object):
score = np.ones((text_comps.shape[0], 1), dtype=np.float32)
text_comps = np.hstack([text_comps, score])
+ check_install('lanms', 'lanms-neo')
+ from lanms import merge_quadrangle_n9 as la_nms
text_comps = la_nms(text_comps, self.text_comp_nms_thr)
if text_comps.shape[0] >= 1:
diff --git a/ppocr/postprocess/east_postprocess.py b/ppocr/postprocess/east_postprocess.py
index c194c81c6911aac0f9210109c37b76b44532e9c4..e9ba095d3993ecb614b9569a7bb3c743268773ea 100755
--- a/ppocr/postprocess/east_postprocess.py
+++ b/ppocr/postprocess/east_postprocess.py
@@ -22,6 +22,7 @@ import cv2
import paddle
import os
+from ppocr.utils.utility import check_install
import sys
@@ -78,11 +79,12 @@ class EASTPostProcess(object):
boxes[:, 8] = score_map[xy_text[:, 0], xy_text[:, 1]]
try:
+ check_install('lanms', 'lanms-nova')
import lanms
boxes = lanms.merge_quadrangle_n9(boxes, nms_thresh)
except:
print(
- 'you should install lanms by pip3 install lanms-nova to speed up nms_locality'
+ 'You should install lanms by pip3 install lanms-nova to speed up nms_locality'
)
boxes = nms_locality(boxes.astype(np.float64), nms_thresh)
if boxes.shape[0] == 0:
diff --git a/ppocr/postprocess/sast_postprocess.py b/ppocr/postprocess/sast_postprocess.py
index bee75c05b1a3ea59193d566f91378c96797f533b..594bf17d6a0db2ebee17e7476834ce7b6b4289e6 100755
--- a/ppocr/postprocess/sast_postprocess.py
+++ b/ppocr/postprocess/sast_postprocess.py
@@ -141,6 +141,8 @@ class SASTPostProcess(object):
def nms(self, dets):
if self.is_python35:
+ from ppocr.utils.utility import check_install
+ check_install('lanms', 'lanms-nova')
import lanms
dets = lanms.merge_quadrangle_n9(dets, self.nms_thresh)
else:
diff --git a/ppocr/utils/e2e_metric/Deteval.py b/ppocr/utils/e2e_metric/Deteval.py
index 6ce56eda2aa9f38fdc712d49ae64945c558b418d..387b3c24e6aa78fc2a7b4e03d5c133c3c6c61112 100755
--- a/ppocr/utils/e2e_metric/Deteval.py
+++ b/ppocr/utils/e2e_metric/Deteval.py
@@ -15,7 +15,11 @@
import json
import numpy as np
import scipy.io as io
+
+from ppocr.utils.utility import check_install
+check_install("Polygon", "Polygon3")
import Polygon as plg
+
from ppocr.utils.e2e_metric.polygon_fast import iod, area_of_intersection, area
diff --git a/ppocr/utils/utility.py b/ppocr/utils/utility.py
index 18357c8e97bcea8ee321856a87146a4a7b901469..0f8660ceb4f0297f3e5c81d3b60f53cecab42070 100755
--- a/ppocr/utils/utility.py
+++ b/ppocr/utils/utility.py
@@ -19,6 +19,9 @@ import cv2
import random
import numpy as np
import paddle
+import importlib.util
+import sys
+import subprocess
def print_dict(d, logger, delimiter=0):
@@ -131,6 +134,26 @@ def set_seed(seed=1024):
paddle.seed(seed)
+def check_install(module_name, install_name):
+ spec = importlib.util.find_spec(module_name)
+ if spec is None:
+ print(f'Warning! The {module_name} module is NOT installed')
+ print(
+ f'Trying to install the {module_name} module automatically. You can also install it manually with pip install {install_name}.'
+ )
+ python = sys.executable
+ try:
+ subprocess.check_call(
+ [python, '-m', 'pip', 'install', install_name],
+ stdout=subprocess.DEVNULL)
+ print(f'The {module_name} module is now installed')
+ except subprocess.CalledProcessError as exc:
+ raise Exception(
+ f"Failed to install {module_name}, please install it manually with pip install {install_name}") from exc
+ else:
+ print(f"{module_name} has been installed.")
+
+
class AverageMeter:
def __init__(self):
self.reset()
diff --git a/requirements.txt b/requirements.txt
index 9b73e3bf8bb4fbbed1961aa4d76ab3a6b8e8e3a5..f3d9ce89e3e2ae9079598d37f75b9e4e63d871a6 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -13,7 +13,5 @@ cython
lxml
premailer
openpyxl
-attrdict3
-Polygon3
-lanms-neo==1.0.2
+attrdict
PyMuPDF==1.19.0
\ No newline at end of file