English | [简体中文](FACE_DETECTION.md)
# FaceDetection
## Table of Contents
- [Introduction](#Introduction)
- [Benchmark and Model Zoo](#Benchmark-and-Model-Zoo)
- [Quick Start](#Quick-Start)
- [Data Pipeline](#Data-Pipline)
- [Training and Inference](#Training-and-Inference)
- [Evaluation](#Evaluation)
- [Algorithm Description](#Algorithm-Description)
- [Contributing](#Contributing)
## Introduction
The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
including cutting-edge and classic models.
![](../images/12_Group_Group_12_Group_Group_12_935.jpg)
## Benchmark and Model Zoo
The architectures supported by PaddleDetection are shown in the table below. Please refer to
[Algorithm Description](#Algorithm-Description) for details of each algorithm.
| | Original | Lite [1](#lite) | NAS [2](#nas) |
|:------------------------:|:--------:|:--------------------------:|:------------------------:|
| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
[1] The `Lite` edition reduces the number of network layers and channels.
[2] The `NAS` edition uses a `Neural Architecture Search` algorithm to
optimize the network structure.
### Model Zoo
#### mAP in WIDER FACE
| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
| BlazeFace | NAS_V2 | 640 | 8 | 32w | 0.870 | 0.837 | 0.685 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas2.tar) |
| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
**NOTES:**
- The mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation in `tools/face_eval.py`.
For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
- BlazeFace-Lite training and testing use the [blazeface.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/master/configs/face_detection/blazeface.yml)
config file with `lite_edition: true` set.
#### mAP in FDDB
| Architecture | Type | Size | DistROC | ContROC |
|:------------:|:--------:|:----:|:-------:|:-------:|
| BlazeFace | Original | 640 | **0.992** | **0.762** |
| BlazeFace | Lite | 640 | 0.990 | 0.756 |
| BlazeFace | NAS | 640 | 0.981 | 0.741 |
| FaceBoxes | Original | 640 | 0.987 | 0.736 |
| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
**NOTES:**
- These results are obtained by multi-scale evaluation on the FDDB dataset.
For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
#### Infer Time and Model Size comparison
| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
**NOTES:**
- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.
- P4(trt32) and CPU tests are based on PaddlePaddle 1.6.1.
- ARM test environment:
  - Qualcomm SnapDragon 855(armv8);
  - Single thread;
  - Paddle-Lite version 2.0.0.
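To make the speed and size trade-off in the table above easier to read, the ratios can be computed directly. The following is a minimal sketch that copies the BlazeFace rows for input size 640 from the table; the dictionary layout and the `speedup` helper are illustrative names and are not part of PaddleDetection.

```python
# Latency (ms) and model size (MB) for BlazeFace at input size 640,
# copied from the table above. The data layout is illustrative only.
blazeface_640 = {
    "Original": {"p4_ms": 8.885, "cpu_ms": 519.364, "arm_ms": 149.896, "size_mb": 0.777},
    "Lite": {"p4_ms": 6.988, "cpu_ms": 284.13, "arm_ms": 149.902, "size_mb": 0.68},
    "NAS": {"p4_ms": 7.448, "cpu_ms": 142.91, "arm_ms": 69.8266, "size_mb": 0.234},
}


def speedup(baseline, candidate, key):
    """Return how many times smaller candidate[key] is than baseline[key]."""
    return baseline[key] / candidate[key]


original = blazeface_640["Original"]
for name in ("Lite", "NAS"):
    row = blazeface_640[name]
    print(
        f"{name}: CPU x{speedup(original, row, 'cpu_ms'):.2f}, "
        f"ARM x{speedup(original, row, 'arm_ms'):.2f}, "
        f"size x{speedup(original, row, 'size_mb'):.2f}"
    )
```

These ratios should be weighed against the accuracy figures in the WIDER FACE table, where the Hard Set mAP is 0.797 for BlazeFace Original and 0.658 for BlazeFace NAS.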
## Quick Start
### Data Pipline
We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training
and testing the models. The official website gives a detailed introduction to the data.
- WIDER FACE data source:
The `wider_face` dataset type is loaded from a directory structure like this:
```
dataset/wider_face/
├── wider_face_split
│ ├── wider_face_train_bbx_gt.txt
│ ├── wider_face_val_bbx_gt.txt
├── WIDER_train
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_100.jpg
│ │ │ ├── 0_Parade_marchingband_1_381.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
├── WIDER_val
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_1004.jpg
│ │ │ ├── 0_Parade_marchingband_1_1045.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
```
- Download dataset manually:
To download the WIDER FACE dataset, run the following commands:
```
cd dataset/wider_face && ./download.sh
```
- Download dataset automatically:
If a training session is started but the dataset is not set up properly
(e.g., not found in `dataset/wider_face`), PaddleDetection can automatically
download it from the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) website.
The decompressed dataset is cached in `~/.cache/paddle/dataset/` and is found
automatically in subsequent runs.
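Before starting a training run, it can help to confirm that the expected layout is in place. The sketch below checks only the top-level entries from the directory tree shown above; the function name `missing_wider_face_entries` and the `EXPECTED_ENTRIES` tuple are hypothetical helpers for illustration, not part of PaddleDetection.

```python
from pathlib import Path

# Entries taken from the directory tree shown above.
EXPECTED_ENTRIES = (
    "wider_face_split/wider_face_train_bbx_gt.txt",
    "wider_face_split/wider_face_val_bbx_gt.txt",
    "WIDER_train/images",
    "WIDER_val/images",
)


def missing_wider_face_entries(root="dataset/wider_face"):
    """Return the expected files or directories that are absent under `root`."""
    root_path = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root_path / entry).exists()]


if __name__ == "__main__":
    missing = missing_wider_face_entries()
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("WIDER FACE layout looks complete.")
```

The check is deliberately shallow: it does not verify individual images or the contents of the annotation files.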
#### Data Augmentation
- **Data-anchor-sampling:** Randomly rescales the image within a certain range of scales,
which greatly increases the scale variation of the faces. Specifically, compute $v=\sqrt{width * height}$
from the width and height of a randomly selected face, then determine where `v` falls among the intervals bounded by
`[16,32,64,128]`. Assuming `v=45` && `32