Commit d850046e · Author: andyjpaddle

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into dygraph

@@ -15,8 +15,8 @@
 - **Data overview**: The publaynet training set contains 350,000 images and the validation set contains 11,000 images. There are 5 categories: `text, title, list, table, figure`. Some images and their annotation boxes are visualized below.
 <div align="center">
-    <img src="../datasets/publaynet_demo/gt_PMC3724501_00006.jpg" width="500">
-    <img src="../datasets/publaynet_demo/gt_PMC5086060_00002.jpg" width="500">
+    <img src="../../datasets/publaynet_demo/gt_PMC3724501_00006.jpg" width="500">
+    <img src="../../datasets/publaynet_demo/gt_PMC5086060_00002.jpg" width="500">
 </div>
 - **Download**: https://developer.ibm.com/exchanges/data/all/publaynet/
@@ -30,8 +30,8 @@
 - **Data overview**: The CDLA training set contains 5,000 images and the validation set contains 1,000 images. There are 10 categories: `Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation`. Some images and their annotation boxes are visualized below.
 <div align="center">
-    <img src="../datasets/CDLA_demo/val_0633.jpg" width="500">
-    <img src="../datasets/CDLA_demo/val_0941.jpg" width="500">
+    <img src="../../datasets/CDLA_demo/val_0633.jpg" width="500">
+    <img src="../../datasets/CDLA_demo/val_0941.jpg" width="500">
 </div>
 - **Download**: https://github.com/buptlihang/CDLA
@@ -45,8 +45,8 @@
 - **Data overview**: The TableBank dataset covers two document types: Latex (187,199 training images, 7,265 validation images, 5,719 test images) and Word (73,383 training images, 2,735 validation images, 2,281 test images). It contains a single category, `Table`. Some images and their annotation boxes are visualized below.
 <div align="center">
-    <img src="../datasets/tablebank_demo/004.png" height="700">
-    <img src="../datasets/tablebank_demo/005.png" height="700">
+    <img src="../../datasets/tablebank_demo/004.png" height="700">
+    <img src="../../datasets/tablebank_demo/005.png" height="700">
 </div>
 - **Download**: https://doc-analysis.github.io/tablebank-page/index.html
......
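The three hunks above all add one more `..` to the image paths. A minimal sketch of why that matters, assuming a hypothetical layout in which the document sits two directory levels below the folder that holds `datasets/` (the directory names `project/docs/zh` and the helper `resolve_img_src` are illustrative and do not come from the commit):

```python
import posixpath


def resolve_img_src(doc_path: str, src: str) -> str:
    """Resolve an <img src> value relative to the directory holding the document."""
    doc_dir = posixpath.dirname(doc_path)
    return posixpath.normpath(posixpath.join(doc_dir, src))


# Hypothetical location of the Markdown file (an assumption for illustration).
doc = "project/docs/zh/datasets.md"

old = resolve_img_src(doc, "../datasets/CDLA_demo/val_0633.jpg")
new = resolve_img_src(doc, "../../datasets/CDLA_demo/val_0633.jpg")

print(old)  # project/docs/datasets/CDLA_demo/val_0633.jpg
print(new)  # project/datasets/CDLA_demo/val_0633.jpg
```

Under this assumed layout the old path points inside `docs/`, while the new path climbs one level further and reaches `project/datasets/`.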
@@ -176,11 +176,6 @@ class Kie_backbone(nn.Layer):
         x = self.img_feat(img)
         boxes, rois_num = self.bbox2roi(gt_bboxes)
         feats = paddle.vision.ops.roi_align(
-            x,
-            boxes,
-            spatial_scale=1.0,
-            pooled_height=7,
-            pooled_width=7,
-            rois_num=rois_num)
+            x, boxes, spatial_scale=1.0, output_size=7, boxes_num=rois_num)
         feats = self.maxpool(feats).squeeze(-1).squeeze(-1)
         return [relations, texts, feats]
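The keyword renaming in this hunk (`pooled_height`/`pooled_width` to `output_size`, `rois_num` to `boxes_num`) can be summarized with a small helper. This is only a sketch: `migrate_roi_align_kwargs` is a hypothetical function, not part of Paddle or PaddleOCR, and emitting a `(height, width)` tuple for non-square pooling is an assumption that this commit does not exercise.

```python
def migrate_roi_align_kwargs(old_kwargs: dict) -> dict:
    """Map the keyword names removed in the hunk onto the ones it adds.

    Hypothetical helper for illustration only.
    """
    new_kwargs = dict(old_kwargs)  # leave the caller's dict untouched
    height = new_kwargs.pop("pooled_height", None)
    width = new_kwargs.pop("pooled_width", None)
    if height is not None or width is not None:
        if height is None or width is None:
            raise ValueError("pooled_height and pooled_width must be given together")
        # Square pooling collapses to a single int, as in the commit (7 and 7 -> 7).
        # The tuple form for non-square pooling is an assumption.
        new_kwargs["output_size"] = height if height == width else (height, width)
    if "rois_num" in new_kwargs:
        new_kwargs["boxes_num"] = new_kwargs.pop("rois_num")
    return new_kwargs


# The string "rois_num" stands in for the tensor passed in the real call.
old_call = {"spatial_scale": 1.0, "pooled_height": 7, "pooled_width": 7,
            "rois_num": "rois_num"}
new_call = migrate_roi_align_kwargs(old_call)
print(new_call)
```

For the arguments used in `Kie_backbone`, the result carries `spatial_scale=1.0`, `output_size=7`, and `boxes_num`, matching the added line of the hunk.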
@@ -76,7 +76,7 @@ def export_single_model(model, arch_config, save_path, logger, quanter=None):
     else:
         infer_shape = [3, -1, -1]
         if arch_config["model_type"] == "rec":
-            infer_shape = [3, 32, -1]  # for rec model, H must be 32
+            infer_shape = [3, 48, -1]  # for rec model, H must be 32
             if "Transform" in arch_config and arch_config[
                     "Transform"] is not None and arch_config["Transform"][
                         "name"] == "TPS":
......
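The shape selection visible in the last hunk can be sketched in isolation. The helper names `pick_infer_shape` and `with_batch_dim` are hypothetical, only the branch shown in the hunk is modeled, and prepending a dynamic batch dimension written as `None` is an assumption about how such a shape is typically consumed, not something the hunk shows.

```python
def pick_infer_shape(model_type: str, rec_height: int = 48) -> list:
    """Return a [C, H, W] shape in which -1 marks a dynamic dimension.

    Hypothetical helper mirroring only the branch visible in the hunk.
    """
    if model_type == "rec":
        # After this commit the recognition height is fixed at 48 (previously 32).
        return [3, rec_height, -1]
    return [3, -1, -1]


def with_batch_dim(infer_shape: list) -> list:
    """Prepend a dynamic batch dimension (assumption: written as None)."""
    return [None] + infer_shape


rec_shape = pick_infer_shape("rec")
det_shape = pick_infer_shape("det")
print(with_batch_dim(rec_shape))  # [None, 3, 48, -1]
print(with_batch_dim(det_shape))  # [None, 3, -1, -1]
```

Passing `rec_height=32` reproduces the value on the removed line of the hunk.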