Commit 3af943f3, authored by LDOUBLEV

fix e2e

Parent 0a276ad4
...@@ -69,7 +69,7 @@ Metric: ...@@ -69,7 +69,7 @@ Metric:
Train: Train:
dataset: dataset:
name: PGDataSet name: PGDataSet
label_file_list: [.././train_data/total_text/train/total_text.txt] label_file_list: [.././train_data/total_text/train/]
ratio_list: [1.0] ratio_list: [1.0]
data_format: icdar #two data format: icdar/textnet data_format: icdar #two data format: icdar/textnet
transforms: transforms:
...@@ -93,7 +93,7 @@ Eval: ...@@ -93,7 +93,7 @@ Eval:
dataset: dataset:
name: PGDataSet name: PGDataSet
data_dir: ./train_data/ data_dir: ./train_data/
label_file_list: [./train_data/total_text/test/total_text.txt] label_file_list: [./train_data/total_text/test/]
transforms: transforms:
- DecodeImage: # load image - DecodeImage: # load image
img_mode: RGB img_mode: RGB
...@@ -113,4 +113,4 @@ Eval: ...@@ -113,4 +113,4 @@ Eval:
shuffle: False shuffle: False
drop_last: False drop_last: False
batch_size_per_card: 1 # must be 1 batch_size_per_card: 1 # must be 1
num_workers: 2 num_workers: 2
\ No newline at end of file
...@@ -87,15 +87,15 @@ python3 tools/infer/predict_e2e.py --e2e_algorithm="PGNet" --image_dir="./doc/im ...@@ -87,15 +87,15 @@ python3 tools/infer/predict_e2e.py --e2e_algorithm="PGNet" --image_dir="./doc/im
``` ```
/PaddleOCR/train_data/total_text/train/ /PaddleOCR/train_data/total_text/train/
|- rgb/ # training data of the total_text dataset |- rgb/ # training data of the total_text dataset
|- img11.jpg |- gt_0.png
| ... | ...
|- train.txt # training annotations of the total_text dataset |- total_text.txt # training annotations of the total_text dataset
``` ```
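The total_text.txt annotation file has the following format; the file name and the annotation information are separated by "\t": The total_text.txt annotation file has the following format; the file name and the annotation information are separated by "\t":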
``` ```
" Image file name Image annotation information encoded by json.dumps" " Image file name Image annotation information encoded by json.dumps"
rgb/img11.jpg [{"transcription": "ASRAMA", "points": [[214.0, 325.0], [235.0, 308.0], [259.0, 296.0], [286.0, 291.0], [313.0, 295.0], [338.0, 305.0], [362.0, 320.0], [349.0, 347.0], [330.0, 337.0], [310.0, 329.0], [290.0, 324.0], [269.0, 328.0], [249.0, 336.0], [231.0, 346.0]]}, {...}] rgb/gt_0.png [{"transcription": "EST", "points": [[1004.0,689.0],[1019.0,698.0],[1034.0,708.0],[1049.0,718.0],[1064.0,728.0],[1079.0,738.0],[1095.0,748.0],[1094.0,774.0],[1079.0,765.0],[1065.0,756.0],[1050.0,747.0],[1036.0,738.0],[1021.0,729.0],[1007.0,721.0]]}, {...}]
``` ```
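Before json.dumps encoding, the image annotation is a list of dictionaries. In each dictionary, `points` holds the (x, y) coordinates of the four points of the text box, arranged clockwise starting from the top-left point. Before json.dumps encoding, the image annotation is a list of dictionaries. In each dictionary, `points` holds the (x, y) coordinates of the four points of the text box, arranged clockwise starting from the top-left point.
`transcription` is the text of the current text box; **when its content is "###", the text box is invalid and is skipped during training.** `transcription` is the text of the current text box; **when its content is "###", the text box is invalid and is skipped during training.**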
......
...@@ -80,15 +80,15 @@ Download and unzip [totaltext](https://github.com/cs-chan/Total-Text-Dataset/blo ...@@ -80,15 +80,15 @@ Download and unzip [totaltext](https://github.com/cs-chan/Total-Text-Dataset/blo
``` ```
/PaddleOCR/train_data/total_text/train/ /PaddleOCR/train_data/total_text/train/
|- rgb/ # total_text training data of dataset |- rgb/ # total_text training data of dataset
|- img11.png |- gt_0.png
| ... | ...
|- train.txt # total_text training annotation of dataset |- total_text.txt # total_text training annotation of dataset
``` ```
total_text.txt: the format of dimension file is as follows,the file name and annotation information are separated by "\t": total_text.txt: the format of dimension file is as follows,the file name and annotation information are separated by "\t":
``` ```
" Image file name Image annotation information encoded by json.dumps" " Image file name Image annotation information encoded by json.dumps"
rgb/img11.jpg [{"transcription": "ASRAMA", "points": [[214.0, 325.0], [235.0, 308.0], [259.0, 296.0], [286.0, 291.0], [313.0, 295.0], [338.0, 305.0], [362.0, 320.0], [349.0, 347.0], [330.0, 337.0], [310.0, 329.0], [290.0, 324.0], [269.0, 328.0], [249.0, 336.0], [231.0, 346.0]]}, {...}] rgb/gt_0.png [{"transcription": "EST", "points": [[1004.0,689.0],[1019.0,698.0],[1034.0,708.0],[1049.0,718.0],[1064.0,728.0],[1079.0,738.0],[1095.0,748.0],[1094.0,774.0],[1079.0,765.0],[1065.0,756.0],[1050.0,747.0],[1036.0,738.0],[1021.0,729.0],[1007.0,721.0]]}, {...}]
``` ```
The image annotation after **json.dumps()** encoding is a list containing multiple dictionaries. The image annotation after **json.dumps()** encoding is a list containing multiple dictionaries.
......
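The annotation format documented in the hunks above (image path, a tab, then a json.dumps-encoded list of boxes) can be read back with a few lines of Python. This is a minimal sketch, not PaddleOCR's own loader: the helper name `parse_label_line` and the second box in the sample line are hypothetical and exist only to illustrate the "###" rule.

```python
import json


def parse_label_line(line):
    """Split one annotation line into (image_path, valid_boxes).

    Hypothetical helper for illustration; not part of PaddleOCR.
    """
    # The file name and the annotation are separated by a single tab.
    image_path, encoded = line.rstrip("\n").split("\t", 1)
    boxes = json.loads(encoded)
    # Boxes transcribed as "###" are marked invalid and skipped in training.
    valid_boxes = [box for box in boxes if box["transcription"] != "###"]
    return image_path, valid_boxes


# Sample line: the first box is taken from the documentation example above,
# the second ("###") box is made up to show the filtering.
sample_line = "rgb/gt_0.png\t" + json.dumps([
    {"transcription": "EST",
     "points": [[1004.0, 689.0], [1019.0, 698.0], [1034.0, 708.0],
                [1049.0, 718.0], [1064.0, 728.0], [1079.0, 738.0],
                [1095.0, 748.0], [1094.0, 774.0], [1079.0, 765.0],
                [1065.0, 756.0], [1050.0, 747.0], [1036.0, 738.0],
                [1021.0, 729.0], [1007.0, 721.0]]},
    {"transcription": "###",
     "points": [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]]},
])

image_path, valid_boxes = parse_label_line(sample_line)
print(image_path, [box["transcription"] for box in valid_boxes])
```

Splitting with `split("\t", 1)` keeps any tab that might appear inside the JSON payload intact, which is why the split is limited to the first separator.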