diff --git a/deploy/cpp_infer/readme.md b/deploy/cpp_infer/readme.md
index a87db7e6596bc2528bfb4a93c3170ebf0482ccad..545924c5ce5e33bd35c0c49eaef40bd2f06fabc6 100644
--- a/deploy/cpp_infer/readme.md
+++ b/deploy/cpp_infer/readme.md
@@ -171,6 +171,9 @@ inference/
 |-- cls
 |   |--inference.pdiparams
 |   |--inference.pdmodel
+|-- table
+|   |--inference.pdiparams
+|   |--inference.pdmodel
 ```
 
 
@@ -275,6 +278,22 @@ Specifically,
     --cls=true \
 ```
 
+
+##### 7. table
+```shell
+./build/ppocr --det_model_dir=inference/det_db \
+    --rec_model_dir=inference/rec_rcnn \
+    --cls_model_dir=inference/cls \
+    --table_model_dir=inference/table \
+    --image_dir=../../ppstructure/docs/table/table.jpg \
+    --use_angle_cls=true \
+    --det=true \
+    --rec=true \
+    --cls=true \
+    --type=structure \
+    --table=true
+```
+
 More parameters are as follows,
 
 - Common parameters
@@ -293,9 +312,9 @@ More parameters are as follows,
 
 |parameter|data type|default|meaning|
 | :---: | :---: | :---: | :---: |
-|det|bool|true|前向是否执行文字检测|
-|rec|bool|true|前向是否执行文字识别|
-|cls|bool|false|前向是否执行文字方向分类|
+|det|bool|true|Whether to perform text detection in the forward direction|
+|rec|bool|true|Whether to perform text recognition in the forward direction|
+|cls|bool|false|Whether to perform text direction classification in the forward direction|
 
 - Detection related parameters
 
@@ -329,6 +348,15 @@ More parameters are as follows,
 |rec_img_h|int|48|image height of recognition|
 |rec_img_w|int|320|image width of recognition|
 
+- Table recognition related parameters
+
+|parameter|data type|default|meaning|
+| :---: | :---: | :---: | :---: |
+|table_model_dir|string|-|Address of table recognition inference model|
+|table_char_dict_path|string|../../ppocr/utils/dict/table_structure_dict.txt|dictionary file|
+|table_max_len|int|488|The size of the long side of the input image of the table recognition model, the final input image size of the network is(table_max_len,table_max_len)|
+
+
 * Multi-language inference is also supported in PaddleOCR, you can refer to [recognition tutorial](../../doc/doc_en/recognition_en.md) for more supported languages and models in PaddleOCR. Specifically, if you want to infer using multi-language models, you just need to modify values of `rec_char_dict_path` and `rec_model_dir`.
 
 
@@ -344,6 +372,12 @@ predict img: ../../doc/imgs/12.jpg
 The detection visualized image saved in ./output//12.jpg
 ```
 
+- table
+
+```bash
+predict img: ../../ppstructure/docs/table/table.jpg
+0 type: table, region: [0,0,371,293], res:
Methods          R      P      F      FPS
SegLink [26]     70.0   86.0   77.0   8.9
PixelLink [4]    73.2   83.0   77.8   -
TextSnake [18]   73.9   83.2   78.3   1.1
TextField [37]   75.9   87.4   81.3   5.2
MSR[38]          76.7   87.4   81.7   -
FTSN [3]         77.1   87.6   82.0   -
LSE[30]          81.7   84.2   82.9   -
CRAFT [2]        78.2   88.2   82.9   8.6
MCN [16]         79     88     83     -
ATRR[35]         82.1   85.2   83.6   -
PAN [34]         83.8   84.4   84.1   30.2
DB[12]           79.2   91.5   84.9   32.0
DRRG [41]        82.30  88.05  85.08  -
Ours (SynText)   80.68  85.40  82.97  12.68
Ours (MLT-17)    84.54  86.62  85.57  12.31
+```
 
 ## 3. FAQ
 
diff --git a/deploy/cpp_infer/readme_ch.md b/deploy/cpp_infer/readme_ch.md
index 8c334851c0d44acd393c6daa79edf25dc9e6fa24..fb994a5b41a0e466f7dfb2f00eabe4bf556e642c 100644
--- a/deploy/cpp_infer/readme_ch.md
+++ b/deploy/cpp_infer/readme_ch.md
@@ -181,6 +181,9 @@ inference/
 |-- cls
 |   |--inference.pdiparams
 |   |--inference.pdmodel
+|-- table
+|   |--inference.pdiparams
+|   |--inference.pdmodel
 ```
 
 
@@ -285,6 +288,21 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
     --cls=true \
 ```
 
+##### 7. 表格识别
+```shell
+./build/ppocr --det_model_dir=inference/det_db \
+    --rec_model_dir=inference/rec_rcnn \
+    --cls_model_dir=inference/cls \
+    --table_model_dir=inference/table \
+    --image_dir=../../ppstructure/docs/table/table.jpg \
+    --use_angle_cls=true \
+    --det=true \
+    --rec=true \
+    --cls=true \
+    --type=structure \
+    --table=true
+```
+
 更多支持的可调节参数解释如下:
 
 - 通用参数
@@ -328,21 +346,32 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
 |cls_thresh|float|0.9|方向分类器的得分阈值|
 |cls_batch_num|int|1|方向分类器batchsize|
 
-- 识别模型相关
+- 文字识别模型相关
 
 |参数名称|类型|默认参数|意义|
 | :---: | :---: | :---: | :---: |
-|rec_model_dir|string|-|识别模型inference model地址|
+|rec_model_dir|string|-|文字识别模型inference model地址|
 |rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|字典文件|
-|rec_batch_num|int|6|识别模型batchsize|
-|rec_img_h|int|48|识别模型输入图像高度|
-|rec_img_w|int|320|识别模型输入图像宽度|
+|rec_batch_num|int|6|文字识别模型batchsize|
+|rec_img_h|int|48|文字识别模型输入图像高度|
+|rec_img_w|int|320|文字识别模型输入图像宽度|
+
+
+- 表格识别模型相关
+
+|参数名称|类型|默认参数|意义|
+| :---: | :---: | :---: | :---: |
+|table_model_dir|string|-|表格识别模型inference model地址|
+|table_char_dict_path|string|../../ppocr/utils/dict/table_structure_dict.txt|字典文件|
+|table_max_len|int|488|表格识别模型输入图像长边大小,最终网络输入图像大小为(table_max_len,table_max_len)|
 
 * PaddleOCR也支持多语言的预测,更多支持的语言和模型可以参考[识别文档](../../doc/doc_ch/recognition.md)中的多语言字典与模型部分,如果希望进行多语言预测,只需将修改`rec_char_dict_path`(字典文件路径)以及`rec_model_dir`(inference模型路径)字段即可。
 
 
 最终屏幕上会输出检测结果如下。
 
+- ocr
+
 ```bash
 predict img: ../../doc/imgs/12.jpg
 ../../doc/imgs/12.jpg
@@ -353,6 +382,13 @@ predict img: ../../doc/imgs/12.jpg
 The detection visualized image saved in ./output//12.jpg
 ```
 
+- table
+
+```bash
+predict img: ../../ppstructure/docs/table/table.jpg
+0 type: table, region: [0,0,371,293], res:
Methods          R      P      F      FPS
SegLink [26]     70.0   86.0   77.0   8.9
PixelLink [4]    73.2   83.0   77.8   -
TextSnake [18]   73.9   83.2   78.3   1.1
TextField [37]   75.9   87.4   81.3   5.2
MSR[38]          76.7   87.4   81.7   -
FTSN [3]         77.1   87.6   82.0   -
LSE[30]          81.7   84.2   82.9   -
CRAFT [2]        78.2   88.2   82.9   8.6
MCN [16]         79     88     83     -
ATRR[35]         82.1   85.2   83.6   -
PAN [34]         83.8   84.4   84.1   30.2
DB[12]           79.2   91.5   84.9   32.0
DRRG [41]        82.30  88.05  85.08  -
Ours (SynText)   80.68  85.40  82.97  12.68
Ours (MLT-17)    84.54  86.62  85.57  12.31
+```
+
 
 ## 3. FAQ
 
diff --git a/deploy/cpp_infer/src/main.cpp b/deploy/cpp_infer/src/main.cpp
index aa35eca3f138bee69b270cae6b976f3aa9874b33..66412a7b283f84107e117cfd59fb7d7aabff651c 100644
--- a/deploy/cpp_infer/src/main.cpp
+++ b/deploy/cpp_infer/src/main.cpp
@@ -119,7 +119,7 @@ void structure(std::vector<cv::String> &cv_all_img_names) {
   std::vector<std::vector<StructurePredictResult>> structure_results =
       engine.structure(cv_all_img_names, false, FLAGS_table);
   for (int i = 0; i < cv_all_img_names.size(); i++) {
-    cout << cv_all_img_names[i] << "\n";
+    cout << "predict img: " << cv_all_img_names[i] << endl;
     for (int j = 0; j < structure_results[i].size(); j++) {
       std::cout << j << "\ttype: " << structure_results[i][j].type
                 << ", region: [";
@@ -127,7 +127,7 @@ void structure(std::vector<cv::String> &cv_all_img_names) {
                 << structure_results[i][j].box[1] << ","
                 << structure_results[i][j].box[2] << ","
                 << structure_results[i][j].box[3] << "], res: ";
-      if (structure_results[i][j].type == "Table") {
+      if (structure_results[i][j].type == "table") {
         std::cout << structure_results[i][j].html << std::endl;
       } else {
         Utility::print_result(structure_results[i][j].text_res);
diff --git a/deploy/cpp_infer/src/paddlestructure.cpp b/deploy/cpp_infer/src/paddlestructure.cpp
index dbaa84fe8454cbe33f04a3b4328c6e9bf7c54943..1ca85a96bbcf09472ce5916375a24a9441a2da53 100644
--- a/deploy/cpp_infer/src/paddlestructure.cpp
+++ b/deploy/cpp_infer/src/paddlestructure.cpp
@@ -55,7 +55,7 @@ PaddleStructure::structure(std::vector<cv::String> cv_all_img_names,
     if (layout) {
     } else {
       StructurePredictResult res;
-      res.type = "Table";
+      res.type = "table";
       res.box = std::vector<int>(4, 0);
       res.box[2] = srcimg.cols;
       res.box[3] = srcimg.rows;
@@ -65,7 +65,7 @@ PaddleStructure::structure(std::vector<cv::String> cv_all_img_names,
     for (int i = 0; i < structure_result.size(); i++) {
       // crop image
       roi_img = Utility::crop_image(srcimg, structure_result[i].box);
-      if (structure_result[i].type == "Table") {
+      if (structure_result[i].type == "table") {
         this->table(roi_img, structure_result[i],
                     time_info_table, time_info_det, time_info_rec, time_info_cls);
       }
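
Note on the `"Table"` to `"table"` rename in `main.cpp` and `paddlestructure.cpp`: `std::string` equality is case-sensitive, so the site that sets `res.type` and the sites that compare against it have to agree on the exact spelling, which is why all three literals change together. A minimal sketch of one way to keep them in sync is to route every site through a shared constant. The names `kTableType`, `Result`, and `describe` are hypothetical and exist only for this illustration; they are not part of PaddleOCR.

```cpp
#include <string>

// Hypothetical shared label. The diff itself hard-codes "table" at each site.
const std::string kTableType = "table";

// Hypothetical stand-in for StructurePredictResult, reduced to two fields.
struct Result {
  std::string type;
  std::string html;
};

// Returns the HTML payload for table regions and a placeholder otherwise.
// The comparison is case-sensitive, so "Table" does not match "table".
std::string describe(const Result &r) {
  if (r.type == kTableType) {
    return r.html;
  }
  return "(non-table region)";
}
```

With this arrangement a result labelled with the old capitalised `"Table"` falls through to the non-table branch, which is the kind of mismatch a partial rename would produce.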