diff --git a/doc/ic15_location_download.png b/doc/datasets/ic15_location_download.png
similarity index 100%
rename from doc/ic15_location_download.png
rename to doc/datasets/ic15_location_download.png
diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md
index 57bfdc01e28042e70e42f0dfecb6f8c81d92d8f1..88cb197d5a704e66c96d7b29a11ea5562cd9e14d 100644
--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -19,15 +19,16 @@
 ## 1.1 Data Preparation
-The icdar2015 dataset can be downloaded from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads); registration is required for the first download.
+The icdar2015 TextLocalization dataset is a text detection dataset containing 1000 training images and 500 test images.
+The icdar2015 dataset can be downloaded from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads); registration is required for the first download.
 After registering and logging in, download the parts marked with red boxes in the figure below. Save the content downloaded from `Training Set Images` as the folder `icdar_c4_train_imgs`, and the content downloaded from `Test Set Images` as the folder `ch4_test_images`

- +

-Extract the downloaded dataset into the working directory, assuming it is extracted under PaddleOCR/train_data/ . In addition, PaddleOCR has consolidated the scattered annotation files into single annotation files
+Extract the downloaded dataset into the working directory, assuming it is extracted under PaddleOCR/train_data/. In addition, PaddleOCR has consolidated the scattered annotation files into single annotation files
 , which you can download via wget.
 ```shell
 # Under the PaddleOCR path
diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index 8f12d42fe798de7d330f1d3ef1950325887525cb..03b88179ba983ff247dbe05ac7b139f4c719385d 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -18,13 +18,14 @@
 This section uses the icdar2015 dataset as an example to introduce the training, evaluation, and testing of the detection model in PaddleOCR.
 ## 1.1 DATA PREPARATION
-The icdar2015 dataset can be obtained from [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads). Registration is required for downloading.
+
+The icdar2015 dataset contains a training set of 1000 images and a test set of 500 images, all obtained with wearable cameras. The icdar2015 dataset can be obtained from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads). Registration is required for downloading.
 After registering and logging in, download the part marked in the red box in the figure below. And, the content downloaded by `Training Set Images` should be saved as the folder `icdar_c4_train_imgs`, and the content downloaded by `Test Set Images` is saved as the folder `ch4_test_images`

- +
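
Once the archives are unpacked, a quick check can confirm that both folders exist and hold the expected number of images. The sketch below is not part of PaddleOCR: the `train_data` root, the helper names `count_images` and `check_layout`, and the set of image extensions are assumptions made for illustration. Only the folder names and the 1000/500 image counts come from the text above.

```python
from pathlib import Path

# Folder names and image counts are taken from the documentation text;
# everything else in this sketch is an illustrative assumption.
EXPECTED = {
    "icdar_c4_train_imgs": 1000,
    "ch4_test_images": 500,
}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def count_images(folder):
    """Count files directly inside `folder` that look like images."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(
        1
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_layout(root):
    """Return {folder: (found, expected)} for each folder whose count is off."""
    root = Path(root)
    problems = {}
    for name, expected in EXPECTED.items():
        found = count_images(root / name)
        if found != expected:
            problems[name] = (found, expected)
    return problems


if __name__ == "__main__":
    # "train_data" is an assumed root; point it at the actual extraction path.
    for name, (found, expected) in check_layout("train_data").items():
        print(f"{name}: found {found} images, expected {expected}")
```

An empty result from `check_layout` means both folders match the expected counts. A missing folder is reported with a count of 0.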

Decompress the downloaded dataset to the working directory, assuming it is decompressed under PaddleOCR/train_data/. In addition, PaddleOCR organizes many scattered annotation files into two separate annotation files for train and test respectively, which can be downloaded by wget: