Unverified commit 520b9610, authored by qingqing01, committed by GitHub

[cherry-pick] Fix some docs. (#2705)

* Fix some docs. (#2698)
* Unify COCO and VOC
* Change PASCAL to Pascal
* Unify dataset/coco
Parent 4ae0c089
@@ -79,7 +79,7 @@ FasterRCNNTrainFeed:
   - !PadBatch
     pad_to_stride: 128
   dataset:
-    dataset_dir: data/coco
+    dataset_dir: dataset/coco
     annotation: annotations/instances_train2017.json
     image_dir: train2017
   num_workers: 2
@@ -90,7 +90,7 @@ FasterRCNNEvalFeed:
   - !PadBatch
     pad_to_stride: 128
   dataset:
-    dataset_dir: data/coco
+    dataset_dir: dataset/coco
     annotation: annotations/instances_val2017.json
     image_dir: val2017
   num_workers: 2
......
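For context on the `pad_to_stride: 128` entries above: FPN-style models need input heights and widths divisible by the coarsest feature stride, so the batch transform zero-pads every image in a batch up to the next multiple. Below is a minimal numpy sketch of that behavior; the standalone `pad_batch` function is illustrative, not PaddleDetection's actual `!PadBatch` implementation.

```python
import numpy as np

def pad_batch(images, pad_to_stride=128):
    """Zero-pad CHW images to a shared, stride-divisible height/width."""
    max_h = max(im.shape[1] for im in images)
    max_w = max(im.shape[2] for im in images)
    # Round the common size up to the next multiple of the stride.
    pad_h = int(np.ceil(max_h / pad_to_stride)) * pad_to_stride
    pad_w = int(np.ceil(max_w / pad_to_stride)) * pad_to_stride
    padded = np.zeros((len(images), images[0].shape[0], pad_h, pad_w),
                      dtype=images[0].dtype)
    for i, im in enumerate(images):
        # Keep each image in the top-left corner; the rest stays zero.
        padded[i, :, :im.shape[1], :im.shape[2]] = im
    return padded
```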
@@ -28,7 +28,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "/home/yang/models/PaddleCV/object_detection\n"
+      "/home/yang/models/PaddleCV/PaddleDetection\n"
      ]
     }
    ],
@@ -111,13 +111,13 @@ The corresponding (generated) YAML snippet is as follows, note this is the config
 ```yaml
 RPNHead:
-  test_prop:
+  test_proposal:
     eta: 1.0
     min_size: 0.1
     nms_thresh: 0.5
     post_nms_top_n: 1000
     pre_nms_top_n: 6000
-  train_prop:
+  train_proposal:
     eta: 1.0
     min_size: 0.1
     nms_thresh: 0.5
......
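To make the renamed `train_proposal`/`test_proposal` keys concrete: during proposal generation, the head keeps the `pre_nms_top_n` highest-scoring boxes, drops those smaller than `min_size`, runs NMS with threshold `nms_thresh` (with `eta` controlling adaptive-NMS threshold decay), and returns at most `post_nms_top_n` proposals. A rough, framework-free sketch of that filtering order follows; it illustrates the parameter semantics and is not the actual Paddle operator.

```python
import numpy as np

def generate_proposals(boxes, scores, pre_nms_top_n=6000, post_nms_top_n=1000,
                       nms_thresh=0.5, min_size=0.1, eta=1.0):
    """Filter raw (N, 4) xyxy boxes with (N,) scores into final proposals."""
    order = scores.argsort()[::-1][:pre_nms_top_n]        # top-k before NMS
    boxes, scores = boxes[order], scores[order]
    w, h = boxes[:, 2] - boxes[:, 0], boxes[:, 3] - boxes[:, 1]
    valid = (w >= min_size) & (h >= min_size)             # drop tiny boxes
    boxes, scores = boxes[valid], scores[valid]

    keep, thresh = [], nms_thresh
    idx = list(range(len(boxes)))
    while idx and len(keep) < post_nms_top_n:
        i = idx.pop(0)                                    # highest remaining score
        keep.append(i)
        if not idx:
            break
        rest = np.asarray(idx)
        # IoU of box i against all remaining candidates.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        iou = inter / (areas[i] + areas[rest] - inter)
        idx = [j for j, ov in zip(rest.tolist(), iou) if ov <= thresh]
        if eta < 1.0 and thresh > 0.5:                    # adaptive NMS decay
            thresh *= eta
    return boxes[keep], scores[keep]
```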
@@ -103,13 +103,13 @@ class RPNHead(object):
 ```yaml
 RPNHead:
-  test_prop:
+  test_proposal:
     eta: 1.0
     min_size: 0.1
     nms_thresh: 0.5
     post_nms_top_n: 1000
     pre_nms_top_n: 6000
-  train_prop:
+  train_proposal:
     eta: 1.0
     min_size: 0.1
     nms_thresh: 0.5
......
@@ -10,23 +10,23 @@ im_info, im_id, gt_bbox, gt_class, is_crowd), (...)]`.
 The data pipeline consists of four sub-systems: data parsing, image
 pre-processing, data conversion and data feeding APIs.
 
-Data samples are collected to form `dataset.Dataset`s, usually 3 sets are
+Data samples are collected to form `data.Dataset`s, usually 3 sets are
 needed for training, validation, and testing respectively.
 
-First, `dataset.source` loads the data files into memory, then
-`dataset.transform` processes them, and lastly, the batched samples
-are fetched by `dataset.Reader`.
+First, `data.source` loads the data files into memory, then
+`data.transform` processes them, and lastly, the batched samples
+are fetched by `data.Reader`.
 
 Sub-systems details:
 
 1. Data parsing
 
-Parses various data sources and creates `dataset.Dataset` instances. Currently,
+Parses various data sources and creates `data.Dataset` instances. Currently,
 following data sources are supported:
 
 - COCO data source
 
   Loads `COCO` type datasets with directory structures like this:
 
   ```
-  data/coco/
+  dataset/coco/
   ├── annotations
   │   ├── instances_train2017.json
   │   ├── instances_val2017.json
@@ -104,19 +104,19 @@ python ./tools/generate_data_for_training.py
 ```
 
 2. Image preprocessing
 
-the `dataset.transform.operator` module provides operations such as image
+The `data.transform.operator` module provides operations such as image
 decoding, expanding, cropping, etc. Multiple operators are combined to form
 larger processing pipelines.
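The operator design composes naturally: each operator is a callable that takes and returns a sample dict, and a pipeline is just their ordered application. A toy sketch of the pattern is below; the operator names are illustrative stand-ins, not the exact classes in `data.transform.operator`.

```python
import cv2

class DecodeImage(object):
    """Illustrative operator: read the image file referenced by a sample."""
    def __call__(self, sample):
        sample['image'] = cv2.imread(sample['im_file'])
        return sample

class ResizeImage(object):
    """Illustrative operator: resize the decoded image to a square size."""
    def __init__(self, target_size=800):
        self.target_size = target_size
    def __call__(self, sample):
        sample['image'] = cv2.resize(sample['image'],
                                     (self.target_size, self.target_size))
        return sample

def compose(operators):
    """Chain operators into a single sample -> sample function."""
    def pipeline(sample):
        for op in operators:
            sample = op(sample)
        return sample
    return pipeline

preprocess = compose([DecodeImage(), ResizeImage(800)])
```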
 3. Data transformer
 
-Transform a `dataset.Dataset` to achieve various desired effects, Notably: the
-`dataset.transform.paralle_map` transformer accelerates image processing with
+Transforms a `data.Dataset` to achieve various desired effects; notably, the
+`data.transform.paralle_map` transformer accelerates image processing with
 multi-threads or multi-processes. More transformers can be found in
-`dataset.transform.transformer`.
+`data.transform.transformer`.
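A transformer in this decorator style wraps one dataset to yield another whose samples are mapped on the fly. A simplified thread-based sketch of the idea follows; the real `paralle_map` transformer also supports multi-process workers and more careful ordering and error handling.

```python
from concurrent.futures import ThreadPoolExecutor

class ParallelMappedDataset(object):
    """Wrap a source dataset, applying `mapper` to samples in worker threads."""
    def __init__(self, source, mapper, worker_num=4):
        self.source = source            # any iterable of samples
        self.mapper = mapper            # sample -> sample function
        self.worker_num = worker_num

    def __iter__(self):
        with ThreadPoolExecutor(max_workers=self.worker_num) as pool:
            # pool.map preserves source order while samples are
            # decoded/transformed concurrently.
            for sample in pool.map(self.mapper, self.source):
                yield sample
```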
 4. Data feeding APIs
 
-To facilitate data pipeline building, we combine multiple `dataset.Dataset` to
-form a `dataset.Reader` which can provide data for training, validation and
+To facilitate data pipeline building, we combine multiple `data.Dataset`s to
+form a `data.Reader` which can provide data for training, validation and
 testing respectively. Users can simply call `Reader.[train|eval|infer]` to get
 the corresponding data stream. Many aspects of the `Reader`, such as storage
 location, preprocessing pipeline, and acceleration mode, can be configured with yaml files.
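Putting the pieces together, a `Reader` is constructed from the data and transform sections of a loaded config and then queried for a stream. The sketch below mirrors the `coco = Reader(ccfg.DATA, ccfg.TRANSFORM, maxiter=-1)` call quoted further down this page; the config path and plain-dict config loading are placeholders, and the import path is an assumption.

```python
import yaml

# Placeholder config file with DATA and TRANSFORM sections.
with open('configs/coco_reader.yml') as f:
    cfg = yaml.safe_load(f)

from ppdet.data.reader import Reader   # assumed import path

# maxiter=-1: iterate until the underlying dataset is exhausted.
coco = Reader(cfg['DATA'], cfg['TRANSFORM'], maxiter=-1)

for batch in coco.train():             # or coco.eval() / coco.infer()
    ...                                # hand each batch to the training loop
```

Whether `Reader.train()` returns an iterable directly or a generator function varies by version, so treat the loop as schematic.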
@@ -126,13 +126,13 @@ The main APIs are as follows:
 1. Data parsing
 
-- `source/coco_loader.py`: COCO dataset parser. [source](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/coco_loader.py)
-- `source/voc_loader.py`: Pascal VOC dataset parser. [source](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/voc_loader.py)
+- `source/coco_loader.py`: COCO dataset parser. [source](../ppdet/data/source/coco_loader.py)
+- `source/voc_loader.py`: Pascal VOC dataset parser. [source](../ppdet/data/source/voc_loader.py)
 
 [Note] To use a non-default label list for VOC datasets, a `label_list.txt`
 file is needed; one can use the provided label list
 (`data/pascalvoc/ImageSets/Main/label_list.txt`) or generate a custom one (with `tools/generate_data_for_training.py`). Also, the `use_default_label` option should
 be set to `false` in the configuration file.
 
-- `source/loader.py`: Roidb dataset parser. [source](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/loader.py)
+- `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py)
 
 2. Operator
 
 `transform/operators.py`: Contains a variety of data enhancement methods, including:
@@ -167,7 +167,7 @@ The main APIs are as follows:
 #### Canned Datasets
 
-Preset for common datasets, e.g., `MS-COCO` and `Pascal Voc` are included. In
+Presets for common datasets, e.g., `COCO` and `Pascal VOC`, are included. In
 most cases, users can simply use these canned datasets as-is. Moreover, the
 whole data pipeline is fully customizable through the yaml configuration files.
......
@@ -4,17 +4,17 @@
 ### Implementation
 
 Internally, the module consists of four sub-systems: data parsing, image preprocessing, data transformation, and data feeding APIs.
 
-We use `dataset.Dataset` to represent a dataset; for example, the `COCO` data contains three sets, used for training, validation, and testing respectively. The raw data is stored in files; it is loaded into memory by `dataset.source`, processed and transformed by `dataset.transform`, and finally, batches for training, validation, and testing are obtained through the `dataset.Reader` interface.
+We use `data.Dataset` to represent a dataset; for example, the `COCO` data contains three sets, used for training, validation, and testing respectively. The raw data is stored in files; it is loaded into memory by `data.source`, processed and transformed by `data.transform`, and finally, batches for training, validation, and testing are obtained through the `data.Reader` interface.
 
 Sub-systems:
 
 1. Data parsing
 
-Data parsing produces a `dataset.Dataset`; the implementation lives in `dataset.source`. It parses datasets in different formats, and the supported data sources include:
+Data parsing produces a `data.Dataset`; the implementation lives in `data.source`. It parses datasets in different formats, and the supported data sources include:
 
 - COCO data source
 
 This dataset comes in COCO2014 and COCO2017 variants, consisting mainly of json annotation files and image files, organized as follows:
 
 ```
-data/coco/
+dataset/coco/
 ├── annotations
 │   ├── instances_train2014.json
 │   ├── instances_train2017.json
@@ -83,7 +83,7 @@
 ```
 
 We provide a script in `./tools/` for generating roidb datasets; it can be run with the command below.
-```python
+```
 # --type: format of the source dataset annotations (only xml or json)
 # --annotation: path to a file listing the required annotation files
 # --save-dir: output directory
@@ -95,13 +95,13 @@ python ./tools/generate_data_for_training.py
 --samples=-1
 ```
 2. Image preprocessing
 
-Image preprocessing includes operations such as image decoding, resizing, and cropping, implemented uniformly as `dataset.transform.operator` operators so they are easy to extend. Multiple operators can also be combined into complex processing pipelines and used by the transformers in `dataset.transformer`, e.g., to run a complex preprocessing pipeline with multiple threads.
+Image preprocessing includes operations such as image decoding, resizing, and cropping, implemented uniformly as `data.transform.operator` operators so they are easy to extend. Multiple operators can also be combined into complex processing pipelines and used by the transformers in `data.transformer`, e.g., to run a complex preprocessing pipeline with multiple threads.
 
 3. Data transformer
 
-A data transformer transforms a `dataset.Dataset` into a new `dataset.Dataset`; the various `dataset.transform.transformer`s are implemented with the decorator pattern, e.g., the `dataset.transform.paralle_map` transformer for multi-process preprocessing.
+A data transformer transforms a `data.Dataset` into a new `data.Dataset`; the various `data.transform.transformer`s are implemented with the decorator pattern, e.g., the `data.transform.paralle_map` transformer for multi-process preprocessing.
 
 4. Data feeding APIs
 
-To make fetching data during training convenient, multiple `dataset.Dataset`s are combined into a `dataset.Reader` that serves data to the user; calling `Reader.[train|eval|infer]` returns the corresponding data stream. The `Reader` supports configuring data locations, the preprocessing pipeline, acceleration mode, etc. via yaml files.
+To make fetching data during training convenient, multiple `data.Dataset`s are combined into a `data.Reader` that serves data to the user; calling `Reader.[train|eval|infer]` returns the corresponding data stream. The `Reader` supports configuring data locations, the preprocessing pipeline, acceleration mode, etc. via yaml files.
 
 The main APIs are as follows:
@@ -110,10 +110,10 @@ python ./tools/generate_data_for_training.py
 1. Data parsing
 
-- `source/coco_loader.py`: parses COCO datasets. [See the code](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/coco_loader.py)
-- `source/voc_loader.py`: parses Pascal VOC datasets. [See the code](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/voc_loader.py)
+- `source/coco_loader.py`: parses COCO datasets. [See the code](../ppdet/data/source/coco_loader.py)
+- `source/voc_loader.py`: parses Pascal VOC datasets. [See the code](../ppdet/data/source/voc_loader.py)
 
 [Note] When using a VOC dataset with a non-default label list, first generate a `label_list.txt` with `tools/generate_data_for_training.py` (used the same way as in the roidb parsing step above), or place a `label_list.txt` under `data/pascalvoc/ImageSets/Main`; also set the `use_default_label` option to `false` in the configuration file.
 
-- `source/loader.py`: parses roidb datasets. [See the code](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/ppdet/data/source/loader.py)
+- `source/loader.py`: parses roidb datasets. [See the code](../ppdet/data/source/loader.py)
 
 2. Operators
 
 `transform/operators.py`: contains a variety of data augmentation methods, mainly including:
@@ -164,7 +164,7 @@ coco = Reader(ccfg.DATA, ccfg.TRANSFORM, maxiter=-1)
 #### How to use a custom dataset?
 
 - Option 1: convert the dataset to VOC or COCO format.
-```python
+```
 # ./tools/ provides labelme2coco.py for converting labelme-annotated datasets to COCO format
 python ./tools/labelme2coco.py --json_input_dir ./labelme_annos/
                                --image_input_dir ./labelme_imgs/
......
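After a conversion like this, it is worth sanity-checking the generated annotations with `pycocotools` before training. A small sketch follows; the annotation path assumes the converter's output layout, which may differ.

```python
from pycocotools.coco import COCO

# Path is a placeholder for wherever labelme2coco.py wrote its output.
coco = COCO('output/annotations/instances_train.json')

print('images:    ', len(coco.getImgIds()))
print('categories:', [c['name'] for c in coco.loadCats(coco.getCatIds())])

# Spot-check the first image and its boxes.
img = coco.loadImgs(coco.getImgIds()[:1])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img['id']))
print(img['file_name'], [a['bbox'] for a in anns])
```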
@@ -13,7 +13,7 @@
 ## Introduction
 
 This document covers how to install PaddleDetection, its dependencies
-(including PaddlePaddle), together with COCO and PASCAL VOC dataset.
+(including PaddlePaddle), together with the COCO and Pascal VOC datasets.
 
 For general information about PaddleDetection, please see [README.md](../README.md).
@@ -68,12 +68,12 @@ with the following commands:
 ```
 cd <path/to/clone/models>
 git clone https://github.com/PaddlePaddle/models
-cd models/PaddleCV/object_detection
+cd models/PaddleCV/PaddleDetection
 ```
 
 **Install Python dependencies:**
 
-Required python packages are specified in [requirements.txt](./requirements.txt), and can be installed with:
+Required python packages are specified in [requirements.txt](../requirements.txt), and can be installed with:
 
 ```
 pip install -r requirements.txt
@@ -89,31 +89,31 @@ python ppdet/modeling/tests/test_architectures.py
 ## Datasets
 
-PaddleDetection includes support for [MSCOCO](http://cocodataset.org) and [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/) by default, please follow these instructions to set up the dataset.
+PaddleDetection includes support for [COCO](http://cocodataset.org) and [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) by default; please follow these instructions to set up the datasets.
 
 **Create symlinks for local datasets:**
 
-Default dataset path in config files is `data/coco` and `data/voc`, if the
+The default dataset paths in the config files are `dataset/coco` and `dataset/voc`; if the
 datasets are already available on disk, you can simply create symlinks to
 their directories:
 
 ```
-ln -sf <path/to/coco> <path/to/paddle_detection>/data/coco
-ln -sf <path/to/voc> <path/to/paddle_detection>/data/voc
+ln -sf <path/to/coco> <path/to/paddle_detection>/dataset/coco
+ln -sf <path/to/voc> <path/to/paddle_detection>/dataset/voc
 ```
 **Download datasets manually:**
 
 Alternatively, to download the datasets, run the following commands:
 
-- MS-COCO
+- COCO
 
 ```
 cd dataset/coco
 ./download.sh
 ```
 
-- PASCAL VOC
+- Pascal VOC
 
 ```
 cd dataset/voc
@@ -123,8 +123,8 @@ cd dataset/voc
 **Download datasets automatically:**
 
 If a training session is started but the dataset is not set up properly (e.g.,
-not found in `data/coc` or `data/voc`), PaddleDetection can automatically
-download them from [MSCOCO-2017](http://images.cocodataset.org) and
+not found in `dataset/coco` or `dataset/voc`), PaddleDetection can automatically
+download them from [COCO-2017](http://images.cocodataset.org) and
 [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC); the decompressed datasets
 will be cached in `~/.cache/paddle/dataset/` and can be discovered automatically
 subsequently.
......
@@ -77,7 +77,7 @@ randomly color distortion, randomly cropping, randomly expansion, randomly inter
 **Notes:** In RetinaNet, the base LR is changed to 0.01 for minibatch size 16.
 
-### SSD on PascalVOC
+### SSD on Pascal VOC
 
 | Backbone | Size | Image/gpu | Lr schd | Box AP | Download |
 | :----------- | :--: | :-----: | :-----: | :----: | :-------: |
......