Commit d8b33ba1, authored by 文幕地方

add pse curved text detection doc

Parent 16bacf03
@@ -52,17 +52,27 @@
 python3 tools/export_model.py -c configs/det/det_r50_vd_pse.yml -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_pse
 ```
-For PSE text detection model inference, you can run the following command:
+For PSE text detection model inference on non-curved text, you can run the following command:
 ```shell
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE"
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=quad
 ```
 The visualized text detection results are saved to the `./inference_results` folder by default, with result file names prefixed by 'det_res'. An example result is shown below:
 ![](../imgs_results/det_res_img_10_pse.jpg)
-**Note**: Because the ICDAR2015 dataset contains only 1,000 training images and mainly targets English scenes, the above model performs relatively poorly on Chinese text images.
+To run curved text detection, you can use the following command:
+```shell
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=poly
+```
+The visualized text detection results are saved to the `./inference_results` folder by default, with result file names prefixed by 'det_res'. An example result is shown below:
+![](../imgs_results/det_res_img_10_pse_poly.jpg)
+**Note**: Because the ICDAR2015 dataset contains only 1,000 training images and mainly targets English scenes, the above model performs relatively poorly on Chinese or curved text images.
 <a name="4-2"></a>
 ### 4.2 C++ Inference
......
@@ -52,17 +52,27 @@ First, convert the model saved in the PSE text detection training process into a
 python3 tools/export_model.py -c configs/det/det_r50_vd_pse.yml -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_pse
 ```
-PSE text detection model inference, you can execute the following command:
+PSE text detection model inference, to perform non-curved text detection, you can run the following commands:
 ```shell
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE"
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=quad
 ```
 The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
 ![](../imgs_results/det_res_img_10_pse.jpg)
-**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly for English scenes, the above model has very poor detection result on Chinese text images.
+If you want to perform curved text detection, you can execute the following command:
+```shell
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=poly
+```
+The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
+![](../imgs_results/det_res_img_10_pse_poly.jpg)
+**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly for English scenes, the above model has very poor detection result on Chinese or curved text images.
 <a name="4-2"></a>
......
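The two inference commands in this hunk differ only in the value of `--det_pse_box_type`. As a minimal sketch, the snippet below assembles both invocations from the flags shown in the diff so they can be compared or scripted. The helper name `build_pse_command` is hypothetical, and the check that only `quad` and `poly` are accepted is an assumption based on the two values that appear in this commit. The snippet prints the commands and does not run them.

```python
import shlex

# Flags copied from the commands in the diff; only the box type varies.
BASE_ARGS = [
    "python3", "tools/infer/predict_det.py",
    "--image_dir=./doc/imgs_en/img_10.jpg",
    "--det_model_dir=./inference/det_pse/",
    "--det_algorithm=PSE",
]


def build_pse_command(box_type):
    """Hypothetical helper: return the argv list for one PSE inference run."""
    # Assumption: only the two values shown in the diff are valid.
    if box_type not in ("quad", "poly"):
        raise ValueError("box_type must be 'quad' or 'poly', got %r" % (box_type,))
    return BASE_ARGS + ["--det_pse_box_type=" + box_type]


if __name__ == "__main__":
    for box_type in ("quad", "poly"):
        # shlex.join renders the argv list as a copy-pasteable shell command.
        print(shlex.join(build_pse_command(box_type)))
```

The returned list can also be passed directly to `subprocess.run` from the repository root if the commands should be executed rather than printed.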
@@ -158,7 +158,7 @@ class TextDetector(object):
         rect[1] = pts[np.argmin(diff)]
         rect[3] = pts[np.argmax(diff)]
         return rect
     def clip_det_res(self, points, img_height, img_width):
         for pno in range(points.shape[0]):
             points[pno, 0] = int(min(max(points[pno, 0], 0), img_width - 1))
@@ -284,7 +284,7 @@ if __name__ == "__main__":
         total_time += elapse
         count += 1
         save_pred = os.path.basename(image_file) + "\t" + str(
-            json.dumps(np.array(dt_boxes).astype(np.int32).tolist())) + "\n"
+            json.dumps([x.tolist() for x in dt_boxes])) + "\n"
         save_results.append(save_pred)
         logger.info(save_pred)
         logger.info("The predict time of {}: {}".format(image_file, elapse))
......
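A plausible reason for the `save_pred` change is the new `poly` box type: polygon boxes can have different numbers of points, so they cannot in general be stacked into one rectangular array the way `np.array(dt_boxes)` attempts, whereas converting each box separately works for any mix of shapes. The sketch below illustrates the per-box serialization, assuming `dt_boxes` is a list of NumPy arrays of shape `(n_points, 2)`. The helper name `format_pred_line` and the sample coordinates are made up for illustration.

```python
import json

import numpy as np


def format_pred_line(image_name, dt_boxes):
    """Hypothetical helper mirroring the new save_pred expression."""
    # Each box is converted on its own, so point counts may differ per box.
    return image_name + "\t" + json.dumps([x.tolist() for x in dt_boxes]) + "\n"


# Made-up boxes: one 4-point quad and one 6-point polygon.
dt_boxes = [
    np.array([[10, 10], [50, 10], [50, 30], [10, 30]], dtype=np.int32),
    np.array([[60, 40], [80, 35], [100, 40], [100, 60], [80, 65], [60, 60]],
             dtype=np.int32),
]

line = format_pred_line("img_10.jpg", dt_boxes)

# Reading the line back: file name and JSON payload are tab-separated.
name, payload = line.rstrip("\n").split("\t")
polygons = json.loads(payload)
print(name, [len(p) for p in polygons])
```

Note that the new expression no longer casts to `np.int32`, so boxes stored as floats are written with float coordinates.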