提交 28c90f86 编写于 作者: G Guanghua Yu 提交者: qingqing01

Add Face detection doc (#3587)

* add face doc
* update face detection readme and add demo jpg
上级 8116ac6f
English | [简体中文](README_cn.md)
# FaceDetection
The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
including cutting-edge and classic models.
<div align="center">
<img src="../../demo/output/12_Group_Group_12_Group_Group_12_935.jpg" />
</div>
## Data Pipline
We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) to carry out the training
and testing of the model, the official website gives detailed data introduction.
- WIDER Face data source:
Loads `wider_face` type dataset with directory structures like this:
```
dataset/wider_face/
├── wider_face_split
│ ├── wider_face_train_bbx_gt.txt
│ ├── wider_face_val_bbx_gt.txt
├── WIDER_train
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_100.jpg
│ │ │ ├── 0_Parade_marchingband_1_381.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
├── WIDER_val
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_1004.jpg
│ │ │ ├── 0_Parade_marchingband_1_1045.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
```
- Download dataset manually:
On the other hand, to download the WIDER FACE dataset, run the following commands:
```
cd dataset/wider_face && ./download.sh
```
- Download dataset automatically:
If a training session is started but the dataset is not setup properly
(e.g, not found in dataset/wider_face), PaddleDetection can automatically
download them from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/),
the decompressed datasets will be cached in ~/.cache/paddle/dataset/ and can be discovered
automatically subsequently.
### Data Augmentation
- **Data-anchor-sampling:** Randomly transform the scale of the image to a certain range of scales,
greatly enhancing the scale change of the face. The specific operation is to obtain $v=\sqrt{width * height}$
according to the randomly selected face height and width, and judge the value of `v` in which interval of
`[16,32,64,128]`. Assuming `v=45` && `32<v<64`, and any value of `[16,32,64]` is selected with a probability
of uniform distribution. If `64` is selected, the face's interval is selected in `[64 / 2, min(v * 2, 64 * 2)]`.
- **Other methods:** Including `RandomDistort`,`ExpandImage`,`RandomInterpImage`,`RandomFlipImage` etc.
Please refer to [DATA.md](../../docs/DATA.md#APIs) for details.
## Benchmark and Model Zoo
Supported architectures is shown in the below table, please refer to
[Algorithm Description](#Algorithm-Description) for details of the algorithm.
| | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
|:------------------------:|:--------:|:--------------------------:|:------------------------:|
| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
<a name="lite">[1]</a> `Lite` edition means reduces the number of network layers and channels.
<a name="nas">[2]</a> `NAS` edition means use `Neural Architecture Search` algorithm to
optimized network structure.
**Todo List:**
- [ ] HamBox
- [ ] Pyramidbox
### Model Zoo
#### mAP in WIDER FACE
| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set |
|:------------:|:--------:|:----:|:-------:|:-------:|:--------:|:----------:|:--------:|
| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** |
| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 |
| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 |
| FaceBoxes | Original | 640 | 8 | 32w | 0.875 | 0.848 | 0.568 |
| FaceBoxes | Lite | 640 | 8 | 32w | 0.898 | 0.872 | 0.752 |
**NOTES:**
- Get mAP in `Easy/Medium/Hard Set` by multi-scale evaluation in `tools/face_eval.py`.
For details can refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
- BlazeFace-Lite Training and Testing ues [blazeface.yml](../../configs/face_detection/blazeface.yml)
configs file and set `lite_edition: true`.
#### mAP in FDDB
| Architecture | Type | Size | DistROC | ContROC |
|:------------:|:--------:|:----:|:-------:|:-------:|
| BlazeFace | Original | 640 | **0.992** | **0.762** |
| BlazeFace | Lite | 640 | 0.990 | 0.756 |
| BlazeFace | NAS | 640 | 0.981 | 0.741 |
| FaceBoxes | Original | 640 | 0.985 | 0.731 |
| FaceBoxes | Lite | 640 | 0.987 | 0.741 |
**NOTES:**
- Get mAP by multi-scale evaluation on the FDDB dataset.
For details can refer to [Evaluation](#Evaluate-on-the-FDDB).
#### Infer Time and Model Size comparison
| Architecture | Type | Size | P4 (ms) | CPU (ms) | ARM (ms) | File size (MB) | Flops |
|:------------:|:--------:|:----:|:---------:|:--------:|:----------:|:--------------:|:---------:|
| BlazeFace | Original | 128 | - | - | - | - | - |
| BlazeFace | Lite | 128 | - | - | - | - | - |
| BlazeFace | NAS | 128 | - | - | - | - | - |
| FaceBoxes | Original | 128 | - | - | - | - | - |
| FaceBoxes | Lite | 128 | - | - | - | - | - |
| BlazeFace | Original | 320 | - | - | - | - | - |
| BlazeFace | Lite | 320 | - | - | - | - | - |
| BlazeFace | NAS | 320 | - | - | - | - | - |
| FaceBoxes | Original | 320 | - | - | - | - | - |
| FaceBoxes | Lite | 320 | - | - | - | - | - |
| BlazeFace | Original | 640 | - | - | - | - | - |
| BlazeFace | Lite | 640 | - | - | - | - | - |
| BlazeFace | NAS | 640 | - | - | - | - | - |
| FaceBoxes | Original | 640 | - | - | - | - | - |
| FaceBoxes | Lite | 640 | - | - | - | - | - |
**NOTES:**
- CPU: i5-7360U @ 2.30GHz. Single core and single thread.
## Get Started
`Training` and `Inference` please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md)
- **NOTES:** Currently we do not support evaluation in training.
### Evaluation
```
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/face_eval.py -c configs/face_detection/blazeface.yml
```
- Optional arguments
- `-d` or `--dataset_dir`: Dataset path, same as dataset_dir of configs. Such as: `-d dataset/wider_face`.
- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
- `-e` or `--eval_mode`: Evaluation mode, include `widerface` and `fddb`, default is `widerface`.
After the evaluation is completed, the test result in txt format will be generated in `output/pred`,
and then mAP will be calculated according to different data sets:
#### Evaluate on the WIDER FACE
- Download the official evaluation script to evaluate the AP metrics:
```
wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
unzip eval_tools.zip && rm -f eval_tools.zip
```
- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
```
# Modify the folder name where the result is stored.
pred_dir = './pred';
# Modify the name of the curve to be drawn
legend_name = 'Fluid-BlazeFace';
```
- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
```
matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
```
#### Evaluate on the FDDB
- Download the official dataset and evaluation script to evaluate the ROC metrics:
```
#external link to the Faces in the Wild data set
wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
#The annotations are split into ten folds. See README for details.
wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
#information on directory structure and file formats
wget http://vis-www.cs.umass.edu/fddb/README.txt
```
- Install OpenCV: Requires [OpenCV library](http://sourceforge.net/projects/opencvlibrary/)
If the utility 'pkg-config' is not available for your operating system,
edit the Makefile to manually specify the OpenCV flags as following:
```
INCS = -I/usr/local/include/opencv
LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
```
- Compile FDDB evaluation code: execute `make` in evaluation folder.
- Generate full image path list and groundtruth in FDDB-folds. The run command is as follows:
```
cat `ls|grep -v"ellipse"` > filePath.txt` and `cat *ellipse* > fddb_annotFile.txt`
```
- Evaluation
Finally evaluation command is:
```
./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
-d DETECTION_RESULT.txt -f 0 \
-i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
-r ./OUTPUT_DIR -z .jpg
```
**NOTES:** The interpretation of the argument can be performed by `./evaluate --help`.
## Algorithm Description
### BlazeFace
**Introduction:**
[BlazeFace](https://arxiv.org/abs/1907.05047) is Google Research published face detection model.
It's lightweight but good performance, and tailored for mobile GPU inference. It runs at a speed
of 200-1000+ FPS on flagship devices.
**Particularity:**
- Anchor scheme stops at 8×8(input 128x128), 6 anchors per pixel at that resolution.
- 5 single, and 6 double BlazeBlocks: 5×5 depthwise convs, same accuracy with fewer layers.
- Replace the non-maximum suppression algorithm with a blending strategy that estimates the
regression parameters of a bounding box as a weighted mean between the overlapping predictions.
**Edition information:**
- Original: Reference original paper reproduction.
- Lite: Replace 5x5 conv with 3x3 conv, fewer network layers and conv channels.
- NAS: use `Neural Architecture Search` algorithm to optimized network structure,
less network layer and conv channel number than `Lite`.
### FaceBoxes
**Introduction:**
[FaceBoxes](https://arxiv.org/abs/1708.05234) which named A CPU Real-time Face Detector
with High Accuracy is face detector proposed by Shifeng Zhang, with high performance on
both speed and accuracy. This paper is published by IJCB(2017).
**Particularity:**
- Anchor scheme stops at 20x20, 10x10, 5x5, which network input size is 640x640,
including 3, 1, 1 anchors per pixel at each resolution. The corresponding densities
are 1, 2, 4(20x20), 4(10x10) and 4(5x5).
- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
- Use density prior box to improve detection accuracy.
**Edition information:**
- Original: Reference original paper reproduction.
- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
Anchor scheme stops at 80x80 and 40x40, including 3, 1 anchors per pixel at each resolution.
The corresponding densities are 1, 2, 4(80x80) and 4(40x40), using less conv channel number than lite.
## Contributing
Contributions are highly welcomed and we would really appreciate your feedback!!
...@@ -89,7 +89,7 @@ SSDEvalFeed: ...@@ -89,7 +89,7 @@ SSDEvalFeed:
fields: ['image', 'im_id', 'gt_box'] fields: ['image', 'im_id', 'gt_box']
dataset: dataset:
dataset_dir: dataset/wider_face dataset_dir: dataset/wider_face
annotation: annotFile.txt #wider_face_split/wider_face_val_bbx_gt.txt annotation: wider_face_split/wider_face_val_bbx_gt.txt
image_dir: WIDER_val/images image_dir: WIDER_val/images
drop_last: false drop_last: false
image_shape: [3, 640, 640] image_shape: [3, 640, 640]
......
# All rights `PaddleDetection` reserved
# References:
# @inproceedings{yang2016wider,
# Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
# Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
# Title = {WIDER FACE: A Face Detection Benchmark},
# Year = {2016}}
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"
# Download the data.
echo "Downloading..."
wget https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip
wget https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip
wget https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip
# Extract the data.
echo "Extracting..."
unzip WIDER_train.zip
unzip WIDER_val.zip
unzip wider_face_split.zip
...@@ -126,6 +126,8 @@ the corresponding data stream. Many aspect of the `Reader`, such as storage ...@@ -126,6 +126,8 @@ the corresponding data stream. Many aspect of the `Reader`, such as storage
location, preprocessing pipeline, acceleration mode can be configured with yaml location, preprocessing pipeline, acceleration mode can be configured with yaml
files. files.
### APIs
The main APIs are as follows: The main APIs are as follows:
1. Data parsing 1. Data parsing
...@@ -139,7 +141,7 @@ The main APIs are as follows: ...@@ -139,7 +141,7 @@ The main APIs are as follows:
- `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py) - `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py)
2. Operator 2. Operator
`transform/operators.py`: Contains a variety of data enhancement methods, including: `transform/operators.py`: Contains a variety of data augmentation methods, including:
- `DecodeImage`: Read images in RGB format. - `DecodeImage`: Read images in RGB format.
- `RandomFlipImage`: Horizontal flip. - `RandomFlipImage`: Horizontal flip.
- `RandomDistort`: Distort brightness, contrast, saturation, and hue. - `RandomDistort`: Distort brightness, contrast, saturation, and hue.
...@@ -150,7 +152,7 @@ The main APIs are as follows: ...@@ -150,7 +152,7 @@ The main APIs are as follows:
- `NormalizeImage`: Normalize image pixel values. - `NormalizeImage`: Normalize image pixel values.
- `NormalizeBox`: Normalize the bounding box. - `NormalizeBox`: Normalize the bounding box.
- `Permute`: Arrange the channels of the image and optionally convert image to BGR format. - `Permute`: Arrange the channels of the image and optionally convert image to BGR format.
- `MixupImage`: Mixup two images with given fraction<sup>[1](#vd)</sup>. - `MixupImage`: Mixup two images with given fraction<sup>[1](#mix)</sup>.
<a name="mix">[1]</a> Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf) <a name="mix">[1]</a> Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf)
......
...@@ -105,9 +105,9 @@ python ./ppdet/data/tools/generate_data_for_training.py ...@@ -105,9 +105,9 @@ python ./ppdet/data/tools/generate_data_for_training.py
4. 数据获取接口 4. 数据获取接口
为方便训练时的数据获取,我们将多个`data.Dataset`组合在一起构成一个`data.Reader`为用户提供数据,用户只需要调用`Reader.[train|eval|infer]`即可获得对应的数据流。`Reader`支持yaml文件配置数据地址、预处理过程、加速方式等。 为方便训练时的数据获取,我们将多个`data.Dataset`组合在一起构成一个`data.Reader`为用户提供数据,用户只需要调用`Reader.[train|eval|infer]`即可获得对应的数据流。`Reader`支持yaml文件配置数据地址、预处理过程、加速方式等。
主要的APIs如下: ### APIs
主要的APIs如下:
1. 数据解析 1. 数据解析
......
...@@ -60,6 +60,17 @@ DATASETS = { ...@@ -60,6 +60,17 @@ DATASETS = {
'http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar', 'http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar',
'b6e924de25625d8de591ea690078ad9f', ), 'b6e924de25625d8de591ea690078ad9f', ),
], ["VOCdevkit/VOC_all"]), ], ["VOCdevkit/VOC_all"]),
'wider_face': ([
(
'https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip',
'3fedf70df600953d25982bcd13d91ba2', ),
(
'https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip',
'dfa7d7e790efa35df3788964cf0bbaea', ),
(
'https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip',
'a4a898d6193db4b9ef3260a68bad0dc7', ),
], ["WIDER_train", "WIDER_val", "wider_face_split"]),
'fruit': ([ 'fruit': ([
( (
'https://dataset.bj.bcebos.com/PaddleDetection_demo/fruit-detection.tar', 'https://dataset.bj.bcebos.com/PaddleDetection_demo/fruit-detection.tar',
...@@ -114,7 +125,8 @@ def get_dataset_path(path, annotation, image_dir): ...@@ -114,7 +125,8 @@ def get_dataset_path(path, annotation, image_dir):
# not match any dataset in DATASETS # not match any dataset in DATASETS
raise ValueError("Dataset {} is not valid and cannot parse dataset type " raise ValueError("Dataset {} is not valid and cannot parse dataset type "
"'{}' for automaticly downloading, which only supports " "'{}' for automaticly downloading, which only supports "
"'voc' and 'coco' currently".format(path, osp.split(path)[-1])) "'voc' and 'coco' currently".format(path,
osp.split(path)[-1]))
def _merge_voc_dir(data_dir, output_subdir): def _merge_voc_dir(data_dir, output_subdir):
...@@ -201,7 +213,7 @@ def _dataset_exists(path, annotation, image_dir): ...@@ -201,7 +213,7 @@ def _dataset_exists(path, annotation, image_dir):
""" """
if not osp.exists(path): if not osp.exists(path):
logger.info("Config dataset_dir {} is not exits, " logger.info("Config dataset_dir {} is not exits, "
"dataset config is not valid".format(path)) "dataset config is not valid".format(path))
return False return False
if annotation: if annotation:
...@@ -324,7 +336,7 @@ def _decompress(fname): ...@@ -324,7 +336,7 @@ def _decompress(fname):
def _move_and_merge_tree(src, dst): def _move_and_merge_tree(src, dst):
""" """
Move src directory to dst, if dst is already exists, Move src directory to dst, if dst is already exists,
merge src to dst merge src to dst
""" """
if not osp.exists(dst): if not osp.exists(dst):
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册