diff --git a/doc/datasets/funsd_demo/gt_train_00040534.jpg b/doc/datasets/funsd_demo/gt_train_00040534.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..9f7cf4d4977689b73e2ca91cbe9c877bb8f0c7ff
Binary files /dev/null and b/doc/datasets/funsd_demo/gt_train_00040534.jpg differ
diff --git a/doc/datasets/funsd_demo/gt_train_00070353.jpg b/doc/datasets/funsd_demo/gt_train_00070353.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..36d3345e5ec4c262764e63a972aaa82e98877681
Binary files /dev/null and b/doc/datasets/funsd_demo/gt_train_00070353.jpg differ
diff --git a/doc/datasets/xfund_demo/gt_zh_train_0.jpg b/doc/datasets/xfund_demo/gt_zh_train_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..6fdaf12fa1d79e6ea9029d665ab7488223459436
Binary files /dev/null and b/doc/datasets/xfund_demo/gt_zh_train_0.jpg differ
diff --git a/doc/datasets/xfund_demo/gt_zh_train_1.jpg b/doc/datasets/xfund_demo/gt_zh_train_1.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..6a1e53a3ba09b6f84809cfd10a15c42f42b9a163
Binary files /dev/null and b/doc/datasets/xfund_demo/gt_zh_train_1.jpg differ
diff --git a/doc/doc_ch/docvqa_datasets.md b/doc/doc_ch/docvqa_datasets.md
new file mode 100644
index 0000000000000000000000000000000000000000..8648329ca2c551babf8249e354258f4fd009a5a2
--- /dev/null
+++ b/doc/doc_ch/docvqa_datasets.md
@@ -0,0 +1,27 @@
+## DocVQA Datasets
+This page collects commonly used DocVQA datasets. It is updated continuously, and contributions of new datasets are welcome.
+- [FUNSD dataset](#funsd)
+- [XFUND dataset](#xfund)
+
+
+#### 1. FUNSD dataset
+- **Source**: https://guillaumejaume.github.io/FUNSD/
+- **Introduction**: FUNSD is a dataset for form understanding. It contains 199 real, fully annotated scanned images covering document types such as marketing reports, advertisements, and academic reports, split into 149 training images and 50 test images. FUNSD is suitable for several kinds of DocVQA tasks, such as field-level entity classification and field-level entity linking. Some of the images, with their annotation boxes visualized, are shown below:
+