From 16eb7516b4e6c91d8ddae7df82169d4f482810c2 Mon Sep 17 00:00:00 2001 From: andyjpaddle Date: Fri, 29 Apr 2022 06:26:03 +0000 Subject: [PATCH] add rec image shape note for v3 rec --- deploy/cpp_infer/readme.md | 2 ++ deploy/cpp_infer/readme_ch.md | 3 +++ doc/doc_ch/inference_ppocr.md | 4 ++++ doc/doc_ch/quickstart.md | 2 ++ doc/doc_ch/whl.md | 2 ++ doc/doc_en/inference_ppocr_en.md | 5 +++++ doc/doc_en/quickstart_en.md | 2 ++ doc/doc_en/whl_en.md | 2 ++ 8 files changed, 22 insertions(+) diff --git a/deploy/cpp_infer/readme.md b/deploy/cpp_infer/readme.md index 41887906..d37a708b 100644 --- a/deploy/cpp_infer/readme.md +++ b/deploy/cpp_infer/readme.md @@ -208,6 +208,8 @@ Execute the built executable file: ./build/ppocr [--param1] [--param2] [...] ``` +**Note**: ppocr uses the `PP-OCRv3` model by default, and its recognition model takes input of shape `3,48,320`, so if you use the recognition function you need to add the parameter `--rec_img_h=48`. If you do not use the default `PP-OCRv3` model, you do not need to set this parameter. + Specifically, ##### 1. det+cls+rec: diff --git a/deploy/cpp_infer/readme_ch.md b/deploy/cpp_infer/readme_ch.md index cf14a676..9462eb70 100644 --- a/deploy/cpp_infer/readme_ch.md +++ b/deploy/cpp_infer/readme_ch.md @@ -213,6 +213,9 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir 本demo支持系统串联调用,也支持单个功能的调用,如,只使用检测或识别功能。 +**注意** ppocr默认使用`PP-OCRv3`模型,识别模型使用的输入shape为`3,48,320`, 因此如果使用识别功能,需要添加参数`--rec_img_h=48`,如果不使用默认的`PP-OCRv3`模型,则无需设置该参数。 + + 运行方式: ```shell ./build/ppocr [--param1] [--param2] [...] 
diff --git a/doc/doc_ch/inference_ppocr.md b/doc/doc_ch/inference_ppocr.md index 23e9f3b6..94630755 100644 --- a/doc/doc_ch/inference_ppocr.md +++ b/doc/doc_ch/inference_ppocr.md @@ -59,6 +59,8 @@ python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_di ### 2.1 超轻量中文识别模型推理 +**注意** `PP-OCRv3`的识别模型使用的输入shape为`3,48,320`, 需要添加参数`--rec_image_shape=3,48,320`,如果不使用`PP-OCRv3`的识别模型,则无需设置该参数。 + 超轻量中文识别模型推理,可以执行如下命令: ``` @@ -119,6 +121,8 @@ Predicts of ./doc/imgs_words/ch/word_4.jpg:['0', 0.9999982] ## 4. 文本检测、方向分类和文字识别串联推理 +**注意** `PP-OCRv3`的识别模型使用的输入shape为`3,48,320`, 需要添加参数`--rec_image_shape=3,48,320`,如果不使用`PP-OCRv3`的识别模型,则无需设置该参数。 + 以超轻量中文OCR模型推理为例,在执行预测时,需要通过参数`image_dir`指定单张图像或者图像集合的路径、参数`det_model_dir`,`cls_model_dir`和`rec_model_dir`分别指定检测,方向分类和识别的inference模型路径。参数`use_angle_cls`用于控制是否启用方向分类模型。`use_mp`表示是否使用多进程。`total_process_num`表示在使用多进程时的进程数。可视化识别结果默认保存到 ./inference_results 文件夹里面。 ```shell diff --git a/doc/doc_ch/quickstart.md b/doc/doc_ch/quickstart.md index c41186a2..6301755d 100644 --- a/doc/doc_ch/quickstart.md +++ b/doc/doc_ch/quickstart.md @@ -59,6 +59,8 @@ cd /path/to/ppocr_img 如果不使用提供的测试图片,可以将下方`--image_dir`参数替换为相应的测试图片路径。 +**注意** whl包默认使用`PP-OCRv3`模型,识别模型使用的输入shape为`3,48,320`, 因此如果使用识别功能,需要添加参数`--rec_image_shape 3,48,320`,如果不使用默认的`PP-OCRv3`模型,则无需设置该参数。 + #### 2.1.1 中英文模型 diff --git a/doc/doc_ch/whl.md b/doc/doc_ch/whl.md index ba571186..7d8cc26f 100644 --- a/doc/doc_ch/whl.md +++ b/doc/doc_ch/whl.md @@ -199,6 +199,8 @@ for line in result: paddleocr -h ``` +**注意** whl包默认使用`PP-OCRv3`模型,识别模型使用的输入shape为`3,48,320`, 因此如果使用识别功能,需要添加参数`--rec_image_shape 3,48,320`,如果不使用默认的`PP-OCRv3`模型,则无需设置该参数。 + * 检测+方向分类器+识别全流程 ```bash diff --git a/doc/doc_en/inference_ppocr_en.md b/doc/doc_en/inference_ppocr_en.md index bcbe3206..dcfb738e 100755 --- a/doc/doc_en/inference_ppocr_en.md +++ b/doc/doc_en/inference_ppocr_en.md @@ -56,6 +56,9 @@ python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_di ### 1. 
Lightweight Chinese Recognition Model Inference +**Note**: The input shape used by the recognition model of `PP-OCRv3` is `3,48,320`, and the parameter `--rec_image_shape=3,48,320` needs to be added. If the recognition model of `PP-OCRv3` is not used, this parameter does not need to be set. + + For lightweight Chinese recognition model inference, you can execute the following commands: ``` @@ -117,6 +120,8 @@ After executing the command, the prediction results (classification angle and sc ## Text Detection Angle Classification and Recognition Inference Concatenation +**Note**: The input shape used by the recognition model of `PP-OCRv3` is `3,48,320`, and the parameter `--rec_image_shape=3,48,320` needs to be added. If the recognition model of `PP-OCRv3` is not used, this parameter does not need to be set. + When performing prediction, you need to specify the path of a single image or a folder of images through the parameter `image_dir`, the parameter `det_model_dir` specifies the path to detect the inference model, the parameter `cls_model_dir` specifies the path to angle classification inference model and the parameter `rec_model_dir` specifies the path to identify the inference model. The parameter `use_angle_cls` is used to control whether to enable the angle classification model. The parameter `use_mp` specifies whether to use multi-process to infer `total_process_num` specifies process number when using multi-process. The parameter . The visualized recognition results are saved to the `./inference_results` folder by default. 
```shell diff --git a/doc/doc_en/quickstart_en.md b/doc/doc_en/quickstart_en.md index 0a420dd4..7243e2db 100644 --- a/doc/doc_en/quickstart_en.md +++ b/doc/doc_en/quickstart_en.md @@ -73,6 +73,8 @@ cd /path/to/ppocr_img If you do not use the provided test image, you can replace the following `--image_dir` parameter with the corresponding test image path +**Note**: The whl package uses the `PP-OCRv3` model by default, and its recognition model takes input of shape `3,48,320`, so if you use the recognition function you need to add the parameter `--rec_image_shape 3,48,320`. If you do not use the default `PP-OCRv3` model, you do not need to set this parameter. + #### 2.1.1 Chinese and English Model diff --git a/doc/doc_en/whl_en.md b/doc/doc_en/whl_en.md index 670653f1..50edfa25 100644 --- a/doc/doc_en/whl_en.md +++ b/doc/doc_en/whl_en.md @@ -172,6 +172,8 @@ show help information paddleocr -h ``` +**Note**: The whl package uses the `PP-OCRv3` model by default, and its recognition model takes input of shape `3,48,320`, so if you use the recognition function you need to add the parameter `--rec_image_shape 3,48,320`. If you do not use the default `PP-OCRv3` model, you do not need to set this parameter. + * detection classification and recognition ```bash paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en --rec_image_shape 3,48,320 -- GitLab