diff --git a/deploy/cpp_infer/readme.md b/deploy/cpp_infer/readme.md
index ddd15d49558454a5ffb0731665b118c929e607f0..942a633fd34ca2722d894eb3f4d4018fd6e48432 100644
--- a/deploy/cpp_infer/readme.md
+++ b/deploy/cpp_infer/readme.md
@@ -208,7 +208,7 @@ Execute the built executable file:
 ./build/ppocr [--param1] [--param2] [...]
 ```
 
-**Note**:ppocr uses the `PP-OCRv3` model by default, and the input shape used by the recognition model is `3, 48, 320`, so if you use the recognition function, you need to add the parameter `--rec_img_h=48`, if you do not use the default `PP-OCRv3` model, you do not need to set this parameter.
+**Note**:ppocr uses the `PP-OCRv3` model by default, and the input shape used by the recognition model is `3, 48, 320`, if you do not use the default `PP-OCRv3` model, you should add the parameter `--rec_img_h=32`.
 
 Specifically,
 
@@ -222,7 +222,6 @@ Specifically,
     --det=true \
     --rec=true \
     --cls=true \
-    --rec_img_h=48\
 ```
 
 ##### 2. det+rec:
@@ -234,7 +233,6 @@ Specifically,
     --det=true \
     --rec=true \
     --cls=false \
-    --rec_img_h=48\
 ```
 
 ##### 3. det
@@ -254,7 +252,6 @@ Specifically,
     --det=false \
     --rec=true \
     --cls=true \
-    --rec_img_h=48\
 ```
 
 ##### 5. rec
@@ -265,7 +262,6 @@ Specifically,
     --det=false \
     --rec=true \
     --cls=false \
-    --rec_img_h=48\
 ```
 
 ##### 6. cls
@@ -330,7 +326,7 @@ More parameters are as follows,
 |rec_model_dir|string|-|Address of recognition inference model|
 |rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|dictionary file|
 |rec_batch_num|int|6|batch size of recognition|
-|rec_img_h|int|32|image height of recognition|
+|rec_img_h|int|48|image height of recognition|
 |rec_img_w|int|320|image width of recognition|
 
 * Multi-language inference is also supported in PaddleOCR, you can refer to [recognition tutorial](../../doc/doc_en/recognition_en.md) for more supported languages and models in PaddleOCR. Specifically, if you want to infer using multi-language models, you just need to modify values of `rec_char_dict_path` and `rec_model_dir`.
diff --git a/deploy/cpp_infer/readme_ch.md b/deploy/cpp_infer/readme_ch.md
index e5a4869eca1d35765013e63011c680e59b33ac00..4a79f978fb46a755a7f3d3e04016b4de22ebe24b 100644
--- a/deploy/cpp_infer/readme_ch.md
+++ b/deploy/cpp_infer/readme_ch.md
@@ -213,7 +213,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
 
 本demo支持系统串联调用,也支持单个功能的调用,如,只使用检测或识别功能。
 
-**注意** ppocr默认使用`PP-OCRv3`模型,识别模型使用的输入shape为`3,48,320`, 因此如果使用识别功能,需要添加参数`--rec_img_h=48`,如果不使用默认的`PP-OCRv3`模型,则无需设置该参数。
+**注意** ppocr默认使用`PP-OCRv3`模型,识别模型使用的输入shape为`3,48,320`, 如果不使用默认的`PP-OCRv3`模型,则需要设置参数`--rec_img_h=32`。
 
 运行方式:
 
@@ -232,7 +232,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
     --det=true \
     --rec=true \
     --cls=true \
-    --rec_img_h=48\
 ```
 
 ##### 2. 检测+识别:
@@ -244,7 +243,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
     --det=true \
     --rec=true \
     --cls=false \
-    --rec_img_h=48\
 ```
 
 ##### 3. 检测:
@@ -264,7 +262,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
     --det=false \
     --rec=true \
     --cls=true \
-    --rec_img_h=48\
 ```
 
 ##### 5. 识别:
@@ -275,7 +272,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
     --det=false \
     --rec=true \
     --cls=false \
-    --rec_img_h=48\
 ```
 
 ##### 6. 分类:
@@ -339,7 +335,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
 |rec_model_dir|string|-|识别模型inference model地址|
 |rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|字典文件|
 |rec_batch_num|int|6|识别模型batchsize|
-|rec_img_h|int|32|识别模型输入图像高度|
+|rec_img_h|int|48|识别模型输入图像高度|
 |rec_img_w|int|320|识别模型输入图像宽度|
 
 
diff --git a/deploy/cpp_infer/src/args.cpp b/deploy/cpp_infer/src/args.cpp
index fe58236734568035dfb26570df39f21154f4e9ef..93d0f5ea5fd07bdc3eb44537bc1c0d4e131736d3 100644
--- a/deploy/cpp_infer/src/args.cpp
+++ b/deploy/cpp_infer/src/args.cpp
@@ -47,7 +47,7 @@ DEFINE_string(rec_model_dir, "", "Path of rec inference model.");
 DEFINE_int32(rec_batch_num, 6, "rec_batch_num.");
 DEFINE_string(rec_char_dict_path, "../../ppocr/utils/ppocr_keys_v1.txt",
               "Path of dictionary.");
-DEFINE_int32(rec_img_h, 32, "rec image height");
+DEFINE_int32(rec_img_h, 48, "rec image height");
 DEFINE_int32(rec_img_w, 320, "rec image width");
 
 // ocr forward related
diff --git a/deploy/cpp_infer/src/ocr_rec.cpp b/deploy/cpp_infer/src/ocr_rec.cpp
index f69f37b8f51ecec5925d556f2b3e169bb0e80715..31a1a884a1aa25134d19e80f9ddac9bc35637fba 100644
--- a/deploy/cpp_infer/src/ocr_rec.cpp
+++ b/deploy/cpp_infer/src/ocr_rec.cpp
@@ -132,7 +132,9 @@ void CRNNRecognizer::LoadModel(const std::string &model_dir) {
   paddle_infer::Config config;
   config.SetModel(model_dir + "/inference.pdmodel",
                   model_dir + "/inference.pdiparams");
-
+  std::cout << "In PP-OCRv3, default rec_img_h is 48,"
+            << "if you use other model, you should set the param rec_img_h=32"
+            << std::endl;
   if (this->use_gpu_) {
     config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
     if (this->use_tensorrt_) {
diff --git a/doc/doc_ch/quickstart.md b/doc/doc_ch/quickstart.md
index 29ca48fa838be4a60f08d31d5031180b951e33bc..d2c2f288b02810ce4224ad06c98fdc2fe6096d63 100644
--- a/doc/doc_ch/quickstart.md
+++ b/doc/doc_ch/quickstart.md
@@ -101,8 +101,17 @@ cd /path/to/ppocr_img
 ['韩国小馆', 0.994467]
 ```
 
+**版本说明**
+paddleocr默认使用PP-OCRv3模型(`--ocr_version PP-OCRv3`),如需使用其他版本可通过设置参数`--ocr_version`,具体版本说明如下:
+| 版本名称 | 版本说明 |
+| --- | --- |
+| PP-OCRv3 | 支持中、英文检测和识别,方向分类器,支持多语种识别 |
+| PP-OCRv2 | 仅支持中英文的检测和识别 |
+| PP-OCR | 支持中、英文检测和识别,方向分类器,支持多语种识别 |
 
-如需使用2.0模型,请指定参数`--ocr_version PP-OCR`,paddleocr默认使用PP-OCRv3模型(`--ocr_version PP-OCRv3`)。更多whl包使用可参考[whl包文档](./whl.md)
+如需新增自己训练的模型,可以在[paddleocr](../../paddleocr.py)中增加模型链接和字段,重新编译即可。
+
+更多whl包使用可参考[whl包文档](./whl.md)
 
 
 
diff --git a/doc/doc_en/quickstart_en.md b/doc/doc_en/quickstart_en.md
index 53f4313579bf39204e085bb0a90d219e3a1c1b5d..fa59054daad19c9cf647c8bde43ca38e6a84db97 100644
--- a/doc/doc_en/quickstart_en.md
+++ b/doc/doc_en/quickstart_en.md
@@ -119,7 +119,18 @@ If you do not use the provided test image, you can replace the following `--imag
 ['PAIN', 0.9934559464454651]
 ```
 
-If you need to use the 2.0 model, please specify the parameter `--ocr_version PP-OCR`, paddleocr uses the PP-OCRv3 model by default(`--ocr_version PP-OCRv3`). More whl package usage can be found in [whl package](./whl_en.md)
+**Version**
+paddleocr uses the PP-OCRv3 model by default(`--ocr_version PP-OCRv3`). If you want to use other versions, you can set the parameter `--ocr_version`, the specific version description is as follows:
+| version name | description |
+| --- | --- |
+| PP-OCRv3 | support Chinese and English detection and recognition, direction classifier, support multilingual recognition |
+| PP-OCRv2 | only supports Chinese and English detection and recognition |
+| PP-OCR | support Chinese and English detection and recognition, direction classifier, support multilingual recognition |
+
+If you want to add your own trained model, you can add model links and keys in [paddleocr](../../paddleocr.py) and recompile.
+
+More whl package usage can be found in [whl package](./whl_en.md)
+
 
 #### 2.1.2 Multi-language Model
 
diff --git a/ppocr/data/simple_dataset.py b/ppocr/data/simple_dataset.py
index b5da9b8898423facf888839f941dff01caa03643..402f1e38fed9e32722e2dd160f10f779028807a3 100644
--- a/ppocr/data/simple_dataset.py
+++ b/ppocr/data/simple_dataset.py
@@ -33,7 +33,7 @@ class SimpleDataSet(Dataset):
         self.delimiter = dataset_config.get('delimiter', '\t')
         label_file_list = dataset_config.pop('label_file_list')
         data_source_num = len(label_file_list)
-        ratio_list = dataset_config.get("ratio_list", [1.0])
+        ratio_list = dataset_config.get("ratio_list", 1.0)
         if isinstance(ratio_list, (float, int)):
             ratio_list = [float(ratio_list)] * int(data_source_num)
 
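
Note on the `simple_dataset.py` hunk: the old default `[1.0]` is already a list, so the `isinstance(ratio_list, (float, int))` branch never ran for it and the default stayed at length 1 regardless of how many label files were given. With the scalar default `1.0`, the existing branch expands it to one ratio per data source. The sketch below isolates those lines in a standalone function; `normalize_ratio_list` is a hypothetical name used only for illustration and is not part of the patched file.

```python
def normalize_ratio_list(dataset_config, data_source_num):
    """Hypothetical helper mirroring the patched lines of SimpleDataSet.__init__."""
    # New default from the patch: a scalar instead of the one-element list [1.0].
    ratio_list = dataset_config.get("ratio_list", 1.0)
    # A scalar (default or user-supplied) is broadcast to one ratio per label file.
    if isinstance(ratio_list, (float, int)):
        ratio_list = [float(ratio_list)] * int(data_source_num)
    # An explicit list is passed through unchanged.
    return ratio_list


if __name__ == "__main__":
    print(normalize_ratio_list({}, 3))
    print(normalize_ratio_list({"ratio_list": 0.5}, 2))
    print(normalize_ratio_list({"ratio_list": [0.5, 1.0]}, 2))
```

A config that omits `ratio_list` while listing several label files now gets a ratio list whose length matches the number of files.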