Add Face detection doc (#3587)

* add face doc
* update face detection readme and add demo jpg

28c90f86 · Guanghua Yu · qingqing01 · 8116ac6f · 28c90f86 · 28c90f86
7 changed files
--- a/configs/face_detection/README.md
+++ b/configs/face_detection/README.md
+English | [简体中文](README_cn.md)
+# FaceDetection
+The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
+including cutting-edge and classic models.
+<div align="center">
+  <img src="../../demo/output/12_Group_Group_12_Group_Group_12_935.jpg" />
+</div>
+## Data Pipeline
+We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training and testing.
+The official website describes the data in detail.
+- WIDER FACE data source:  
+The loader expects a `wider_face` type dataset with the following directory structure:
+  ```
+  dataset/wider_face/
+  ├── wider_face_split
+  │   ├── wider_face_train_bbx_gt.txt
+  │   ├── wider_face_val_bbx_gt.txt
+  ├── WIDER_train
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_100.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_381.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ├── WIDER_val
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_1004.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_1045.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ```
+- Download dataset manually:  
+To download the WIDER FACE dataset manually, run the following commands:
+```
+cd dataset/wider_face && ./download.sh
+```
+- Download dataset automatically:  
+If a training session is started but the dataset is not set up properly
+(e.g., not found in `dataset/wider_face`), PaddleDetection can automatically
+download it from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/).
+The decompressed dataset is cached in `~/.cache/paddle/dataset/` and is discovered
+automatically on subsequent runs.
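
The "not set up properly" condition above can be approximated with a simple layout check. This is a minimal sketch, not PaddleDetection's actual logic: the helper name `missing_wider_face_entries` and the `REQUIRED_ENTRIES` list are assumptions derived from the directory tree shown earlier.

```python
import os

# Entries taken from the directory tree above (assumed to be the minimum needed).
REQUIRED_ENTRIES = (
    "wider_face_split/wider_face_train_bbx_gt.txt",
    "wider_face_split/wider_face_val_bbx_gt.txt",
    "WIDER_train/images",
    "WIDER_val/images",
)


def missing_wider_face_entries(root="dataset/wider_face"):
    """Return the required files/directories that do not exist under root."""
    return [
        rel for rel in REQUIRED_ENTRIES
        if not os.path.exists(os.path.join(root, rel))
    ]
```

An empty list means the layout matches the tree above; any returned entry points at what still has to be downloaded or extracted.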
+### Data Augmentation
+- **Data-anchor-sampling:** Randomly rescales the image so that a face lands in a different scale range,
+which greatly increases the scale variation of the faces seen in training. Concretely, pick a face at random,
+compute $v=\sqrt{width * height}$ from its width and height, and find the interval of `[16,32,64,128]` that
+contains `v`. For example, with `v=45` we have `32<v<64`, so one value of `[16,32,64]` is chosen uniformly at random.
+If `64` is chosen, the target face size is then sampled from `[64 / 2, min(v * 2, 64 * 2)]`.
+- **Other methods:** `RandomDistort`, `ExpandImage`, `RandomInterpImage`, `RandomFlipImage`, etc.
+Please refer to [DATA.md](../../docs/DATA.md#APIs) for details.
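
The data-anchor-sampling steps above can be sketched as follows. This is an illustration under stated assumptions, not the operator shipped in PaddleDetection: the function names are hypothetical, the same range formula `[s / 2, min(v * 2, s * 2)]` is applied to every chosen anchor `s` (the text only spells it out for `64`), and values of `v` outside `[16, 128]` are clamped to the nearest interval.

```python
import bisect
import math
import random

ANCHOR_SCALES = (16, 32, 64, 128)


def candidate_anchors(face_w, face_h, anchors=ANCHOR_SCALES):
    """Return v = sqrt(w * h) and the anchors up to the upper bound of v's interval."""
    v = math.sqrt(face_w * face_h)
    # Index of the first anchor larger than v, clamped to the last anchor.
    upper = min(bisect.bisect_right(anchors, v), len(anchors) - 1)
    return v, list(anchors[:upper + 1])


def sample_resize_factor(face_w, face_h, rng=random, anchors=ANCHOR_SCALES):
    """Pick an anchor uniformly, sample a target face size, return the resize factor."""
    v, candidates = candidate_anchors(face_w, face_h, anchors)
    chosen = rng.choice(candidates)
    low = chosen / 2
    high = max(low, min(2 * v, 2 * chosen))  # guard against an empty range
    target = rng.uniform(low, high)
    return chosen, target, target / v
```

For a 45x45 face, `candidate_anchors(45, 45)` yields `v = 45` and the candidates `[16, 32, 64]`, matching the example in the text. The returned factor is what the whole image would be resized by so that the selected face ends up at the sampled target size.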
+## Benchmark and Model Zoo
+The supported architectures are shown in the table below. Please refer to
+[Algorithm Description](#Algorithm-Description) for details of each algorithm.
+|                          | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
+|:------------------------:|:--------:|:--------------------------:|:------------------------:|
+| [BlazeFace](#BlazeFace)  | ✓        |                          ✓ | ✓                        |
+| [FaceBoxes](#FaceBoxes)  | ✓        |                          ✓ | x                        |
+<a name="lite">[1]</a> The `Lite` edition reduces the number of network layers and channels.  
+<a name="nas">[2]</a> The `NAS` edition uses a `Neural Architecture Search` algorithm to
+optimize the network structure.
+**Todo List:**
+- [ ] HamBox
+- [ ] Pyramidbox
+### Model Zoo
+#### mAP in WIDER FACE
+| Architecture | Type     | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set |
+|:------------:|:--------:|:----:|:-------:|:-------:|:--------:|:----------:|:--------:|
+| BlazeFace    | Original | 640  |    8    | 320k    | **0.915**    | **0.892**      | **0.797**    |
+| BlazeFace    | Lite     | 640  |    8    | 320k    | 0.909    | 0.885      | 0.781    |
+| BlazeFace    | NAS      | 640  |    8    | 320k    | 0.837    | 0.807      | 0.658    |
+| FaceBoxes    | Original | 640  |    8    | 320k    | 0.875    | 0.848      | 0.568    |
+| FaceBoxes    | Lite     | 640  |    8    | 320k    | 0.898    | 0.872      | 0.752    |
+**NOTES:**  
+- The mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation with `tools/face_eval.py`.
+For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
+- BlazeFace-Lite training and testing use the [blazeface.yml](../../configs/face_detection/blazeface.yml)
+config file with `lite_edition: true`.
+#### mAP in FDDB
+| Architecture | Type     | Size | DistROC | ContROC |
+|:------------:|:--------:|:----:|:-------:|:-------:|
+| BlazeFace    | Original | 640  | **0.992**   | **0.762**   |
+| BlazeFace    | Lite     | 640  | 0.990   | 0.756   |
+| BlazeFace    | NAS      | 640  | 0.981   | 0.741   |
+| FaceBoxes    | Original | 640  | 0.985   | 0.731   |
+| FaceBoxes    | Lite     | 640  | 0.987   | 0.741   |
+**NOTES:**  
+- These results are obtained by multi-scale evaluation on the FDDB dataset.
+For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
+#### Infer Time and Model Size comparison  
+| Architecture | Type     | Size | P4 (ms)   | CPU (ms) | ARM (ms)   | File size (MB) | Flops     |
+|:------------:|:--------:|:----:|:---------:|:--------:|:----------:|:--------------:|:---------:|
+| BlazeFace    | Original | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 128  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 128  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | Original | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 320  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 320  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | Original | 640  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 640  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 640  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 640  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 640  | -         | -        | -          | -              | -         |
+**NOTES:**  
+- CPU: i5-7360U @ 2.30GHz. Single core and single thread.
+## Get Started
+For `Training` and `Inference`, please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md).
+- **NOTES:** Evaluation during training is not currently supported.
+### Evaluation
+```
+export CUDA_VISIBLE_DEVICES=0
+export PYTHONPATH=$PYTHONPATH:.
+python tools/face_eval.py -c configs/face_detection/blazeface.yml
+```
+- Optional arguments:
+  - `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, e.g. `-d dataset/wider_face`.
+  - `-f` or `--output_eval`: Directory for the evaluation files; default is `output/pred`.
+  - `-e` or `--eval_mode`: Evaluation mode, either `widerface` or `fddb`; default is `widerface`.
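
To make the flags and their defaults concrete, here is a hypothetical `argparse` sketch that mirrors the options listed above. It is not the parser used by `tools/face_eval.py`; the function name `build_eval_parser`, the long form `--config`, and the `None` default for `--dataset_dir` are assumptions made for illustration.

```python
import argparse


def build_eval_parser():
    """Build a parser mirroring the documented face evaluation flags (sketch only)."""
    parser = argparse.ArgumentParser(description="Face evaluation options (sketch)")
    # '-c' is shown in the command above; the long name is an assumption.
    parser.add_argument("-c", "--config", required=True, help="config file path")
    parser.add_argument("-d", "--dataset_dir", default=None,
                        help="dataset path, same as dataset_dir in the config")
    parser.add_argument("-f", "--output_eval", default="output/pred",
                        help="directory for the evaluation files")
    parser.add_argument("-e", "--eval_mode", default="widerface",
                        choices=("widerface", "fddb"), help="evaluation mode")
    return parser
```

Parsing only `-c configs/face_detection/blazeface.yml` leaves `output_eval` at `output/pred` and `eval_mode` at `widerface`, which matches the documented defaults.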
+After the evaluation completes, the test results are written in txt format to `output/pred`.
+mAP is then calculated with the procedure for the corresponding dataset:
+#### Evaluate on the WIDER FACE
+- Download the official evaluation script to evaluate the AP metrics:
+```
+wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
+unzip eval_tools.zip && rm -f eval_tools.zip
+```
+- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
+```
+% Modify the folder name where the result is stored.
+pred_dir = './pred';
+% Modify the name of the curve to be drawn.
+legend_name = 'Fluid-BlazeFace';
+```
+- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
+```
+matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
+```
+#### Evaluate on the FDDB
+- Download the official dataset and evaluation script to evaluate the ROC metrics:
+```
+# External link to the Faces in the Wild data set
+wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
+# The annotations are split into ten folds. See README for details.
+wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
+# Information on directory structure and file formats
+wget http://vis-www.cs.umass.edu/fddb/README.txt
+```
+- Install OpenCV: the evaluation code requires the [OpenCV library](http://sourceforge.net/projects/opencvlibrary/).  
+If the `pkg-config` utility is not available on your operating system,
+edit the Makefile to specify the OpenCV flags manually as follows:
+```
+INCS = -I/usr/local/include/opencv
+LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
+```
+- Compile the FDDB evaluation code: run `make` in the evaluation folder.
+- Generate the full image path list and the ground truth in FDDB-folds. The commands are as follows:
+```
+cat `ls | grep -v "ellipse"` > filePath.txt
+cat *ellipse* > fddb_annotFile.txt
+```
+- Evaluation:  
+The final evaluation command is:
+```
+./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
+           -d DETECTION_RESULT.txt -f 0 \
+           -i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
+           -r ./OUTPUT_DIR -z .jpg
+```
+**NOTES:** Run `./evaluate --help` for an explanation of the arguments.
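
The fold-merging step in the FDDB procedure above can also be done without the shell. The sketch below is a hedged Python equivalent of the two `cat` commands; the function name `merge_fddb_folds` is hypothetical, and the glob pattern assumes the fold files are named `FDDB-fold-*.txt`, with the annotation files containing `ellipse` in their names.

```python
from pathlib import Path


def merge_fddb_folds(folds_dir, list_name="filePath.txt",
                     annot_name="fddb_annotFile.txt"):
    """Concatenate image lists and ellipse annotations into two files."""
    folds_dir = Path(folds_dir)
    # Assumed naming: FDDB-fold-01.txt, FDDB-fold-01-ellipseList.txt, ...
    sources = sorted(folds_dir.glob("FDDB-fold-*.txt"))
    image_lists = [p for p in sources if "ellipse" not in p.name]
    annotations = [p for p in sources if "ellipse" in p.name]
    (folds_dir / list_name).write_text(
        "".join(p.read_text() for p in image_lists))
    (folds_dir / annot_name).write_text(
        "".join(p.read_text() for p in annotations))
    return len(image_lists), len(annotations)
```

Restricting the glob to the fold files keeps the two output files (and any stray `README.txt`) out of the concatenation when the step is run more than once.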
+## Algorithm Description
+### BlazeFace
+**Introduction:**  
+[BlazeFace](https://arxiv.org/abs/1907.05047) is a face detection model published by Google Research.
+It is lightweight yet performs well, and is tailored for mobile GPU inference. It runs at a speed
+of 200-1000+ FPS on flagship devices.
+**Particularity:**  
+- The anchor scheme stops at 8×8 (for a 128×128 input), with 6 anchors per pixel at that resolution.
+- 5 single and 6 double BlazeBlocks with 5×5 depthwise convs reach the same accuracy with fewer layers.
+- The non-maximum suppression algorithm is replaced with a blending strategy that estimates the
+regression parameters of a bounding box as a weighted mean of the overlapping predictions.
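
The blending strategy in the last point can be illustrated with a small, framework-free sketch. It is not the implementation used in this repository: the function names, the `(x1, y1, x2, y2)` box format, the score-weighted average, and the `iou_threshold=0.3` default are all assumptions chosen for illustration, and scores are assumed to be positive.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def blend_detections(boxes, scores, iou_threshold=0.3):
    """Replace each cluster of overlapping boxes by its score-weighted mean."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    blended = []
    while order:
        best = order[0]
        # The top-scoring box plus every remaining box that overlaps it enough.
        group = [best] + [i for i in order[1:]
                          if iou(boxes[best], boxes[i]) >= iou_threshold]
        total = sum(scores[i] for i in group)
        box = tuple(sum(scores[i] * boxes[i][k] for i in group) / total
                    for k in range(4))
        blended.append((box, scores[best]))
        order = [i for i in order if i not in group]
    return blended
```

Where plain NMS would keep only the top-scoring box of each cluster, this variant lets the suppressed boxes still contribute to the final coordinates.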
+**Edition information:**
+- Original: Reproduction of the original paper.
+- Lite: Replaces 5x5 convs with 3x3 convs and uses fewer network layers and conv channels.
+- NAS: Uses a `Neural Architecture Search` algorithm to optimize the network structure,
+with fewer network layers and conv channels than `Lite`.
+### FaceBoxes
+**Introduction:**
+[FaceBoxes](https://arxiv.org/abs/1708.05234), from the paper "A CPU Real-time Face Detector
+with High Accuracy", is a face detector proposed by Shifeng Zhang et al. that performs well in
+both speed and accuracy. The paper was published at IJCB 2017.
+**Particularity:**
+- With a 640x640 network input, the anchor scheme stops at 20x20, 10x10 and 5x5,
+with 3, 1 and 1 anchors per pixel at each resolution. The corresponding densities
+are 1, 2, 4 (20x20), 4 (10x10) and 4 (5x5).
+- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
+- Uses density prior boxes to improve detection accuracy.
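
The idea behind the density prior box can be sketched as follows. This is an illustration only, assuming that a density of `n` means `n x n` anchor centers tiled evenly inside one feature-map cell, as in the anchor densification described in the FaceBoxes paper. The function name and the `stride` parameter (input size divided by feature-map size) are hypothetical.

```python
def dense_anchor_centers(col, row, stride, density):
    """Return density * density anchor centers tiled evenly inside one cell.

    (col, row) is the cell index on the feature map and stride is the cell
    size in input pixels; density=1 gives the usual single cell center.
    """
    step = stride / density
    x0, y0 = col * stride, row * stride
    return [
        (x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
        for j in range(density)
        for i in range(density)
    ]
```

With `stride=32`, a density of 2 places four centers at the quarter points of the cell, so small anchors are tiled more densely than one per cell.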
+**Edition information:**
+- Original: Reproduction of the original paper.
+- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
+The anchor scheme stops at 80x80 and 40x40, with 3 and 1 anchors per pixel at each resolution.
+The corresponding densities are 1, 2, 4 (80x80) and 4 (40x40), using fewer conv channels than the original.
+## Contributing
+Contributions are highly welcome, and we would really appreciate your feedback!
--- a/configs/face_detection/blazeface.yml
+++ b/configs/face_detection/blazeface.yml
@@ -89,7 +89,7 @@ SSDEvalFeed:
  fields: ['image', 'im_id', 'gt_box']
  dataset:
    dataset_dir: dataset/wider_face
-    annotation: annotFile.txt #wider_face_split/wider_face_val_bbx_gt.txt   
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt   
    image_dir: WIDER_val/images
  drop_last: false
  image_shape: [3, 640, 640]

--- a/dataset/wider_face/download.sh
+++ b/dataset/wider_face/download.sh
+# All rights `PaddleDetection` reserved
+# References:
+#   @inproceedings{yang2016wider,
+#   Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
+#   Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+#   Title = {WIDER FACE: A Face Detection Benchmark},
+#   Year = {2016}}
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+# Download the data.
+echo "Downloading..."
+wget https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip
+wget https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip
+wget https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip
+# Extract the data.
+echo "Extracting..."
+unzip WIDER_train.zip
+unzip WIDER_val.zip
+unzip wider_face_split.zip
--- a/demo/output/12_Group_Group_12_Group_Group_12_935.jpg
+++ b/demo/output/12_Group_Group_12_Group_Group_12_935.jpg
--- a/docs/DATA.md
+++ b/docs/DATA.md
@@ -126,6 +126,8 @@ the corresponding data stream. Many aspect of the `Reader`, such as storage
 location, preprocessing pipeline, acceleration mode can be configured with yaml
 files.
+### APIs
 The main APIs are as follows:
 1. Data parsing
@@ -139,7 +141,7 @@ The main APIs are as follows:
 - `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py)
 2. Operator
- `transform/operators.py`: Contains a variety of data enhancement methods, including:
+ `transform/operators.py`: Contains a variety of data augmentation methods, including:
 - `DecodeImage`: Read images in RGB format.
 - `RandomFlipImage`: Horizontal flip.
 - `RandomDistort`: Distort brightness, contrast, saturation, and hue.
@@ -150,7 +152,7 @@ The main APIs are as follows:
 - `NormalizeImage`: Normalize image pixel values.
 - `NormalizeBox`: Normalize the bounding box.
 - `Permute`: Arrange the channels of the image and optionally convert image to BGR format.
- `MixupImage`: Mixup two images with given fraction<sup>[1](#vd)</sup>.
+- `MixupImage`: Mixup two images with given fraction<sup>[1](#mix)</sup>.
 <a name="mix">[1]</a> Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf)。

--- a/docs/DATA_cn.md
+++ b/docs/DATA_cn.md
@@ -105,9 +105,9 @@ python ./ppdet/data/tools/generate_data_for_training.py
 4. 数据获取接口  
     为方便训练时的数据获取，我们将多个`data.Dataset`组合在一起构成一个`data.Reader`为用户提供数据，用户只需要调用`Reader.[train|eval|infer]`即可获得对应的数据流。`Reader`支持yaml文件配置数据地址、预处理过程、加速方式等。
-主要的APIs如下：
+### APIs
+主要的APIs如下：
 1. 数据解析  

--- a/ppdet/utils/download.py
+++ b/ppdet/utils/download.py
@@ -60,6 +60,17 @@ DATASETS = {
            'http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar',
            'b6e924de25625d8de591ea690078ad9f', ),
    ], ["VOCdevkit/VOC_all"]),
+    'wider_face': ([
+        (
+            'https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip',
+            '3fedf70df600953d25982bcd13d91ba2', ),
+        (
+            'https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip',
+            'dfa7d7e790efa35df3788964cf0bbaea', ),
+        (
+            'https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip',
+            'a4a898d6193db4b9ef3260a68bad0dc7', ),
+    ], ["WIDER_train", "WIDER_val", "wider_face_split"]),
    'fruit': ([
        (
            'https://dataset.bj.bcebos.com/PaddleDetection_demo/fruit-detection.tar',
@@ -114,7 +125,8 @@ def get_dataset_path(path, annotation, image_dir):
    # not match any dataset in DATASETS
    raise ValueError("Dataset {} is not valid and cannot parse dataset type "
                     "'{}' for automaticly downloading, which only supports "
-                     "'voc' and 'coco' currently".format(path, osp.split(path)[-1]))
+                     "'voc' and 'coco' currently".format(path,
+                                                         osp.split(path)[-1]))
 def _merge_voc_dir(data_dir, output_subdir):
@@ -201,7 +213,7 @@ def _dataset_exists(path, annotation, image_dir):
    """
    if not osp.exists(path):
        logger.info("Config dataset_dir {} is not exits, "
-                "dataset config is not valid".format(path))
+                    "dataset config is not valid".format(path))
        return False
    if annotation:
@@ -324,7 +336,7 @@ def _decompress(fname):
 def _move_and_merge_tree(src, dst):
    """
-    Move src directory to dst, if dst is already exists, 
+    Move src directory to dst, if dst is already exists,
    merge src to dst
    """
    if not osp.exists(dst):