Commit adda5ddc, authored by 文幕地方

fix bug

Parent d1b31bf8
@@ -107,4 +107,4 @@
|Model|Backbone|Config|Acc|Download link| |Model|Backbone|Config|Acc|Download link|
|---|---|---|---|---| |---|---|---|---|---|
|TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[trained model]|[trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar)/[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)| |TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar) / [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)|
@@ -55,7 +55,7 @@ python3 tools/export_model.py -c configs/table/table_master.yml -o Global.pretra
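After a successful conversion, the directory contains three files: After a successful conversion, the directory contains three files: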
``` ```
/inference/table_master/ ./inference/table_master/
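├── inference.pdiparams # parameter file of the recognition inference model ├── inference.pdiparams # parameter file of the recognition inference model
├── inference.pdiparams.info # parameter info of the recognition inference model, can be ignored ├── inference.pdiparams.info # parameter info of the recognition inference model, can be ignored
└── inference.pdmodel # program file of the recognition inference model └── inference.pdmodel # program file of the recognition inference model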
......
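Note on the export step above: the listing names three files that should exist after conversion. A minimal sketch of a check for them follows. The helper name `missing_inference_files` is hypothetical and not part of PaddleOCR; only the three file names come from the listing.

```python
from pathlib import Path

# File names taken from the directory listing in the documentation diff above.
EXPECTED_FILES = (
    "inference.pdiparams",
    "inference.pdiparams.info",
    "inference.pdmodel",
)


def missing_inference_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_inference_files("./inference/table_master")` should return an empty list if the export produced all three files.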
# OCR Algorithms # OCR Algorithms
- [1. Two-stage Algorithms](#1) - [1. Two-stage Algorithms](#1-two-stage-algorithms)
* [1.1 Text Detection Algorithms](#11) - [1.1 Text Detection Algorithms](#11-text-detection-algorithms)
* [1.2 Text Recognition Algorithms](#12) - [1.2 Text Recognition Algorithms](#12-text-recognition-algorithms)
- [2. End-to-end Algorithms](#2) - [2. End-to-end Algorithms](#2-end-to-end-algorithms)
- [3. Table Recognition Algorithms](#3) - [3. Table Recognition Algorithms](#3-table-recognition-algorithms)
This tutorial lists the OCR algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md). This tutorial lists the OCR algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md).
@@ -107,4 +107,4 @@ On the PubTabNet dataset, the algorithm result is as follows:
|Model|Backbone|Config|Acc|Download link| |Model|Backbone|Config|Acc|Download link|
|---|---|---|---|---| |---|---|---|---|---|
|TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[训练模型]|[训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar)/[推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)| |TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[trained](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar) / [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)|
# Torm Recognition Algorithm-TableMASTER # Table Recognition Algorithm-TableMASTER
- [1. Introduction](#1-introduction) - [1. Introduction](#1-introduction)
- [2. Environment](#2-environment) - [2. Environment](#2-environment)
@@ -24,7 +24,7 @@ On the PubTabNet table recognition public data set, the algorithm reproduction a
|Model|Backbone|Cnnfig|Acc|Download link| |Model|Backbone|Cnnfig|Acc|Download link|
| --- | --- | --- | --- | --- | | --- | --- | --- | --- | --- |
|TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[train model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar)/[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)| |TableMaster|TableResNetExtra|[configs/table/table_master.yml](../../configs/table/table_master.yml)|77.47%|[trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_train.tar)/[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/tablemaster/table_structure_tablemaster_infer.tar)|
<a name="2"></a> <a name="2"></a>
......
@@ -671,7 +671,7 @@ class TableLabelEncode(AttnLabelEncode):
def _merge_no_span_structure(self, structure): def _merge_no_span_structure(self, structure):
""" """
This fun code is refer from: This code is refer from:
https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/table_recognition/data_preprocess.py https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/table_recognition/data_preprocess.py
""" """
new_structure = [] new_structure = []
......
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
""" """
This fun code is refer from: This code is refer from:
https://github.com/JiaquanYe/TableMASTER-mmocr/tree/master/mmocr/models/textrecog/losses https://github.com/JiaquanYe/TableMASTER-mmocr/tree/master/mmocr/models/textrecog/losses
""" """
......
@@ -31,8 +31,6 @@ class TableStructureMetric(object):
gt_structure_batch_list): gt_structure_batch_list):
pred_str = ''.join(pred) pred_str = ''.join(pred)
target_str = ''.join(target) target_str = ''.join(target)
# pred_str = pred_str.replace('<thead>','').replace('</thead>','').replace('<tbody>','').replace('</tbody>','')
# target_str = target_str.replace('<thead>','').replace('</thead>','').replace('<tbody>','').replace('</tbody>','')
if pred_str == target_str: if pred_str == target_str:
correct_num += 1 correct_num += 1
all_num += 1 all_num += 1
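Note on the hunk above: the metric counts a sample as correct only when the joined predicted token string equals the joined target string. The two deleted lines were already commented out, so `<thead>` and `<tbody>` tokens were, and remain, part of the comparison. A minimal standalone sketch of that exact-match accuracy follows; the function name `structure_accuracy` and the zero-sample fallback are assumptions for illustration, not code from the repository.

```python
def structure_accuracy(preds, targets):
    """Fraction of samples whose joined token sequence matches the target exactly."""
    correct_num = 0
    all_num = 0
    for pred, target in zip(preds, targets):
        pred_str = ''.join(pred)
        target_str = ''.join(target)
        if pred_str == target_str:
            correct_num += 1
        all_num += 1
    # Assumed fallback: report 0.0 when there are no samples.
    return correct_num / all_num if all_num else 0.0
```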
@@ -55,8 +53,6 @@ class TableStructureMetric(object):
self.len_acc_num = 0 self.len_acc_num = 0
self.token_nums = 0 self.token_nums = 0
self.anys_dict = dict() self.anys_dict = dict()
from collections import defaultdict
self.error_num_dict = defaultdict(int)
class TableMetric(object): class TableMetric(object):
......
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
""" """
This fun code is refer from: This code is refer from:
https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/mmocr/models/textrecog/backbones/table_resnet_extra.py https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/mmocr/models/textrecog/backbones/table_resnet_extra.py
""" """
@@ -193,47 +193,43 @@ class TableResNetExtra(nn.Layer):
def forward(self, x): def forward(self, x):
f = [] f = []
x = self.conv1(x) # 1,64,480,480 x = self.conv1(x)
x = self.bn1(x) x = self.bn1(x)
x = self.relu1(x) x = self.relu1(x)
x = self.conv2(x) # 1,128,480,480 x = self.conv2(x)
x = self.bn2(x) x = self.bn2(x)
x = self.relu2(x) x = self.relu2(x)
# (48, 160)
x = self.maxpool1(x) # 1,64,240,240 x = self.maxpool1(x)
x = self.layer1(x) x = self.layer1(x)
x = self.conv3(x) # 1,256,240,240 x = self.conv3(x)
x = self.bn3(x) x = self.bn3(x)
x = self.relu3(x) x = self.relu3(x)
f.append(x) f.append(x)
# (24, 80)
x = self.maxpool2(x) # 1,256,120,120 x = self.maxpool2(x)
x = self.layer2(x) x = self.layer2(x)
x = self.conv4(x) # 1,256,120,120 x = self.conv4(x)
x = self.bn4(x) x = self.bn4(x)
x = self.relu4(x) x = self.relu4(x)
f.append(x) f.append(x)
# (12, 40)
x = self.maxpool3(x) # 1,256,60,60 x = self.maxpool3(x)
x = self.layer3(x) # 1,512,60,60 x = self.layer3(x)
x = self.conv5(x) # 1,512,60,60 x = self.conv5(x)
x = self.bn5(x) x = self.bn5(x)
x = self.relu5(x) x = self.relu5(x)
x = self.layer4(x) # 1,512,60,60 x = self.layer4(x)
x = self.conv6(x) # 1,512,60,60 x = self.conv6(x)
x = self.bn6(x) x = self.bn6(x)
x = self.relu6(x) x = self.relu6(x)
f.append(x) f.append(x)
# (6, 40)
return f return f
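Note on the hunk above: the removed shape comments record a 480x480 input shrinking to 240, 120, and 60 after `maxpool1`, `maxpool2`, and `maxpool3`, and `f` collects one feature map per stage. A rough sketch of that size bookkeeping follows. It assumes each pooling layer halves height and width with floor division and that the convolution layers keep the spatial size. The removed comments suggest this for 480x480 inputs, but it is not verified against the layer definitions.

```python
def feature_map_sizes(height, width, num_pools=3):
    """Spatial size of each collected feature map, assuming every pool halves H and W."""
    sizes = []
    for _ in range(num_pools):
        # Assumption: kernel 2, stride 2 pooling with floor division.
        height, width = height // 2, width // 2
        sizes.append((height, width))
    return sizes


print(feature_map_sizes(480, 480))  # [(240, 240), (120, 120), (60, 60)]
```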
......
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
""" """
This fun code is refer from: This code is refer from:
https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/mmocr/models/textrecog/decoders/master_decoder.py https://github.com/JiaquanYe/TableMASTER-mmocr/blob/master/mmocr/models/textrecog/decoders/master_decoder.py
""" """
@@ -135,7 +135,7 @@ class TableMasterHead(nn.Layer):
batch_size = out_enc.shape[0] batch_size = out_enc.shape[0]
SOS = paddle.zeros([batch_size, 1], dtype='int64') + self.SOS SOS = paddle.zeros([batch_size, 1], dtype='int64') + self.SOS
output, bbox_output = self.greedy_forward(SOS, out_enc) output, bbox_output = self.greedy_forward(SOS, out_enc)
# output = F.softmax(output) output = F.softmax(output)
return {'structure_probs': output, 'loc_preds': bbox_output} return {'structure_probs': output, 'loc_preds': bbox_output}
def forward(self, feat, targets=None): def forward(self, feat, targets=None):
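Note on the hunk above: with the softmax line re-enabled, `structure_probs` carries normalized probabilities instead of raw scores. A pure-Python sketch of what softmax does to one row of scores follows, assuming normalization over the class axis. It illustrates the operation only and does not reproduce the Paddle call.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a single list of scores."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The outputs sum to 1 and keep the same ordering as the inputs, so the greedy argmax choice is unchanged; only the reported confidence values differ.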
......
@@ -110,3 +110,16 @@ def draw_re_results(image,
img_new = Image.blend(image, img_new, 0.5) img_new = Image.blend(image, img_new, 0.5)
return np.array(img_new) return np.array(img_new)
def draw_rectangle(img_path, boxes, use_xywh=False):
    img = cv2.imread(img_path)
    img_show = img.copy()
    for box in boxes.astype(int):
        if use_xywh:
            x, y, w, h = box
            x1, y1, x2, y2 = x - w // 2, y - h // 2, x + w // 2, y + h // 2
        else:
            x1, y1, x2, y2 = box
        cv2.rectangle(img_show, (x1, y1), (x2, y2), (255, 0, 0), 2)
    return img_show
\ No newline at end of file
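Note on the helper above: when `use_xywh` is set, each box is read as a center point plus width and height and converted to corner coordinates with integer division. A standalone sketch of just that conversion follows, without OpenCV. The function name `center_xywh_to_corners` is hypothetical.

```python
def center_xywh_to_corners(box):
    """Convert an integer (cx, cy, w, h) box to (x1, y1, x2, y2) corners."""
    x, y, w, h = box
    # Same integer arithmetic as the use_xywh branch of draw_rectangle.
    return x - w // 2, y - h // 2, x + w // 2, y + h // 2


print(center_xywh_to_corners((10, 10, 4, 6)))  # (8, 7, 12, 13)
```

Because of the floor division, an odd width or height yields a box one pixel smaller than requested; for example, `(5, 5, 3, 3)` maps to `(4, 4, 6, 6)`.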
@@ -30,6 +30,7 @@ from ppocr.data import create_operators, transform
from ppocr.postprocess import build_post_process from ppocr.postprocess import build_post_process
from ppocr.utils.logging import get_logger from ppocr.utils.logging import get_logger
from ppocr.utils.utility import get_image_file_list, check_and_read_gif from ppocr.utils.utility import get_image_file_list, check_and_read_gif
from ppocr.utils.visual import draw_rectangle
from ppstructure.utility import parse_args from ppstructure.utility import parse_args
logger = get_logger() logger = get_logger()
@@ -120,19 +121,6 @@ class TableStructurer(object):
return structure_str_list, bbox_list, elapse return structure_str_list, bbox_list, elapse
def draw_rectangle(img_path, boxes, use_xywh=False):
    img = cv2.imread(img_path)
    img_show = img.copy()
    for box in boxes.astype(int):
        if use_xywh:
            x, y, w, h = box
            x1, y1, x2, y2 = x - w // 2, y - h // 2, x + w // 2, y + h // 2
        else:
            x1, y1, x2, y2 = box
        cv2.rectangle(img_show, (x1, y1), (x2, y2), (255, 0, 0), 2)
    return img_show
def main(args): def main(args):
image_file_list = get_image_file_list(args.image_dir) image_file_list = get_image_file_list(args.image_dir)
table_structurer = TableStructurer(args) table_structurer = TableStructurer(args)
......
@@ -36,6 +36,7 @@ from ppocr.modeling.architectures import build_model
from ppocr.postprocess import build_post_process from ppocr.postprocess import build_post_process
from ppocr.utils.save_load import load_model from ppocr.utils.save_load import load_model
from ppocr.utils.utility import get_image_file_list from ppocr.utils.utility import get_image_file_list
from ppocr.utils.visual import draw_rectangle
import tools.program as program import tools.program as program
import cv2 import cv2
@@ -111,19 +112,6 @@ def main(config, device, logger, vdl_writer):
logger.info("success!") logger.info("success!")
def draw_rectangle(img_path, boxes, use_xywh=False):
    img = cv2.imread(img_path)
    img_show = img.copy()
    for box in boxes.astype(int):
        if use_xywh:
            x, y, w, h = box
            x1, y1, x2, y2 = x - w // 2, y - h // 2, x + w // 2, y + h // 2
        else:
            x1, y1, x2, y2 = box
        cv2.rectangle(img_show, (x1, y1), (x2, y2), (255, 0, 0), 2)
    return img_show
if __name__ == '__main__': if __name__ == '__main__':
config, device, logger, vdl_writer = program.preprocess() config, device, logger, vdl_writer = program.preprocess()
main(config, device, logger, vdl_writer) main(config, device, logger, vdl_writer)