Commit e7562558, authored by Guanghua Yu, committed by qingqing01

Add Chinese doc for face detection and fix get_shrink for large image (#123)

- 1.add cn docs
- 2.fix fddb evaluate
- 3.fix get_shrink
- 4.delete FLAGS.dataset_dir
Parent 4ce9ead4
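......@@ -136,4 +136,3 @@ PaddleDetection aims to provide industry and academia with rich, easy-to-use ...
## Contributing
Contributions to PaddleDetection are very welcome, and we greatly appreciate your feedback.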
[English](README_en.md) | 简体中文
# FaceDetection
The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
including cutting-edge and classic models.
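## Table of Contents
- [Introduction](#Introduction)
- [Benchmark and Model Zoo](#Benchmark-and-Model-Zoo)
- [Quick Start](#Quick-Start)
    - [Data Preparation](#Data-Preparation)
    - [Training and Inference](#Training-and-Inference)
    - [Evaluation](#Evaluation)
- [Algorithm Description](#Algorithm-Description)
- [Contributing](#Contributing)
## Introduction
FaceDetection aims to provide efficient, high-speed face detection solutions, covering both state-of-the-art and classic models.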
<div align="center">
<img src="../../demo/output/12_Group_Group_12_Group_Group_12_935.jpg" />
</div>
## Data Pipline
We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) to carry out the training
and testing of the model, the official website gives detailed data introduction.
- WIDER Face data source:
Loads `wider_face` type dataset with directory structures like this:
## Benchmark and Model Zoo
The table below lists the architectures PaddleDetection currently supports; see [Algorithm Description](#Algorithm-Description) for details.
| | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
|:------------------------:|:--------:|:--------------------------:|:------------------------:|
| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
<a name="lite">[1]</a> The `Lite` edition reduces the number of network layers and channels.
<a name="nas">[2]</a> The `NAS` edition builds the network structure with `Neural Architecture Search`.
### Model Zoo
#### mAP on WIDER-FACE
| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
**NOTES:**
- The mAP on the `Easy/Medium/Hard Set` is obtained with the multi-scale evaluation strategy in `tools/face_eval.py`. See [Evaluate on the WIDER FACE](#Evaluate-on-the-WIDER-FACE) for details.
- BlazeFace-Lite is trained and tested with the [blazeface.yml](../../configs/face_detection/blazeface.yml) config file, with `lite_edition: true` set.
#### mAP on FDDB
| Architecture | Type | Size | DiscROC | ContROC |
|:------------:|:--------:|:----:|:-------:|:-------:|
| BlazeFace | Original | 640 | **0.992** | **0.762** |
| BlazeFace | Lite | 640 | 0.990 | 0.756 |
| BlazeFace | NAS | 640 | 0.981 | 0.741 |
| FaceBoxes | Original | 640 | 0.987 | 0.736 |
| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
**NOTES:**
- The results on FDDB are obtained with multi-scale testing. See [Evaluate on the FDDB](#Evaluate-on-the-FDDB) for details.
#### Inference Time and Model Size Comparison
| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
**NOTES:**
- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.
- The P4(trt32) and CPU inference times were measured with PaddlePaddle 1.6.1.
- ARM test environment:
    - Qualcomm SnapDragon 855(armv8);
    - Single thread;
    - Paddle-Lite version 2.0.0.
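## Quick Start
### Data Preparation
We use the [WIDER-FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training and testing; the official website gives a detailed description of the data.
- WIDER-Face data source:
Load a `wider_face` type dataset with the following directory structure: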
```
dataset/wider_face/
......@@ -36,228 +118,159 @@ Loads `wider_face` type dataset with directory structures like this:
│ │ │ ...
```
- Download dataset manually:
To download the WIDER FACE dataset, run the following commands:
- Download the dataset manually:
To download the WIDER-FACE dataset, run the following command:
```
cd dataset/wider_face && ./download.sh
```
- Download dataset automatically:
If a training session is started but the dataset is not setup properly
(e.g, not found in dataset/wider_face), PaddleDetection can automatically
download them from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/),
the decompressed datasets will be cached in ~/.cache/paddle/dataset/ and can be discovered
automatically subsequently.
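- Download the dataset automatically:
If training has started but the dataset path is set incorrectly or cannot be found, PaddleDetection automatically downloads the dataset from the [WIDER-FACE dataset](http://shuoyang1213.me/WIDERFACE/) site.
The decompressed dataset is cached in `~/.cache/paddle/dataset/` and is loaded automatically by later training and testing runs.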
### Data Augmentation
#### Data Augmentation Methods
- **Data-anchor-sampling:** Randomly transform the scale of the image to a certain range of scales,
greatly enhancing the scale change of the face. The specific operation is to obtain $v=\sqrt{width * height}$
according to the randomly selected face height and width, and judge the value of `v` in which interval of
`[16,32,64,128]`. Assuming `v=45` && `32<v<64`, and any value of `[16,32,64]` is selected with a probability
of uniform distribution. If `64` is selected, the face's interval is selected in `[64 / 2, min(v * 2, 64 * 2)]`.
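- **Scale transformation (Data-anchor-sampling):**
The procedure is: from the height and width of a randomly selected face, compute $v=\sqrt{width * height}$, then determine which interval of the scale list `[16,32,64,128]` contains `v`.
For example, `v=45` falls in `32<v<64`, so one value of `[16,32,64]` is chosen with uniform probability. If `64` is chosen, the face's target scale is then drawn from `[64 / 2, min(v * 2, 64 * 2)]`.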
- **Other methods:** Including `RandomDistort`,`ExpandImage`,`RandomInterpImage`,`RandomFlipImage` etc.
Please refer to [DATA.md](../../docs/DATA.md#APIs) for details.
- **Other methods:** Including random distortion, flipping, cropping, etc. See [DATA_cn.md](../../docs/DATA_cn.md#APIs) for details.
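### Training and Inference
Training and inference work the same way as for the other algorithms; see [GETTING_STARTED_cn.md](../../docs/GETTING_STARTED_cn.md).
**NOTES:**
- `BlazeFace` and `FaceBoxes` are trained on 4 GPUs with `batch_size=8` per GPU (total `batch_size` of 32) for 320000 iterations.
(If you have fewer than 4 GPUs, see the [learning rate calculation rules](../../docs/GETTING_STARTED_cn.md#faq).)
- Evaluation during training is currently not supported for face detection models.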
## Benchmark and Model Zoo
The supported architectures are shown in the table below; please refer to
[Algorithm Description](#Algorithm-Description) for details of the algorithms.
| | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
|:------------------------:|:--------:|:--------------------------:|:------------------------:|
| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
<a name="lite">[1]</a> The `Lite` edition reduces the number of network layers and channels.
<a name="nas">[2]</a> The `NAS` edition uses a `Neural Architecture Search` algorithm to
optimize the network structure.
**Todo List:**
- [ ] HamBox
- [ ] Pyramidbox
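### Evaluation
Evaluation is currently supported on the `WIDER FACE` and `FDDB` datasets. First run `tools/face_eval.py` to generate the evaluation result files, then compute the metrics with matlab (WIDER FACE)
or OpenCV (FDDB).
`tools/face_eval.py` accepts the following arguments:
- `-f` or `--output_eval`: directory where the evaluation result files are saved; default: `output/pred`.
- `-e` or `--eval_mode`: evaluation mode, either `widerface` or `fddb`; default: `widerface`.
- `--multi_scale`: pass this flag to use multi-scale evaluation. The default is `False`, which selects single-scale evaluation.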
### Model Zoo
#### mAP in WIDER FACE
| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
**NOTES:**
- The mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation in `tools/face_eval.py`.
For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
- BlazeFace-Lite training and testing use the [blazeface.yml](../../configs/face_detection/blazeface.yml)
config file with `lite_edition: true` set.
#### mAP in FDDB
| Architecture | Type | Size | DiscROC | ContROC |
|:------------:|:--------:|:----:|:-------:|:-------:|
| BlazeFace | Original | 640 | **0.992** | **0.762** |
| BlazeFace | Lite | 640 | 0.990 | 0.756 |
| BlazeFace | NAS | 640 | 0.981 | 0.741 |
| FaceBoxes | Original | 640 | 0.987 | 0.736 |
| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
**NOTES:**
- These results are obtained by multi-scale evaluation on the FDDB dataset.
For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
#### Infer Time and Model Size comparison
| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
**NOTES:**
- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
- P4(trt32) and CPU tests based on PaddlePaddle, PaddlePaddle version is 1.6.1
- ARM test environment:
- Qualcomm SnapDragon 855(armv8)
- Single thread
- Paddle-Lite version 2.0.0
## Get Started
For `Training` and `Inference`, please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md).
- **NOTES:**
- `BlazeFace` and `FaceBoxes` are trained on 4 GPUs with `batch_size=8` per GPU (total batch size of 32)
for 320000 iterations. (If your GPU count is not 4, please refer to the training-parameter rules
in the table of [calculation rules](../../docs/GETTING_STARTED.md#faq).)
- Evaluation during training is currently not supported.
### Evaluation
#### Evaluate on the WIDER FACE
Evaluate and generate the result files:
```
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/face_eval.py -c configs/face_detection/blazeface.yml
python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
-o weights=output/blazeface/model_final/ \
--eval_mode=widerface
```
- Optional arguments
- `-d` or `--dataset_dir`: Dataset path, same as dataset_dir of configs. Such as: `-d dataset/wider_face`.
- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
- `-e` or `--eval_mode`: Evaluation mode, include `widerface` and `fddb`, default is `widerface`.
- `--multi_scale`: If you add this action button in the command, it will select `multi_scale` evaluation.
Default is `False`, it will select `single-scale` evaluation.
After the evaluation is completed, the test result in txt format will be generated in `output/pred`,
and then mAP will be calculated according to different data sets. If you set `--eval_mode=widerface`,
it will [Evaluate on the WIDER FACE](#Evaluate-on-the-WIDER-FACE).If you set `--eval_mode=fddb`,
it will [Evaluate on the FDDB](#Evaluate-on-the-FDDB).
#### Evaluate on the WIDER FACE
- Download the official evaluation script to evaluate the AP metrics:
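After evaluation completes, the test results are written in txt format to `output/pred`.
- Download the official evaluation script to compute the AP metrics: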
```
wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
unzip eval_tools.zip && rm -f eval_tools.zip
```
- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
- In `eval_tools/wider_eval.m`, modify the result path and the name of the curve to be drawn:
```
% Modify the folder name where the result is stored.
pred_dir = './pred';
% Modify the name of the curve to be drawn.
legend_name = 'Fluid-BlazeFace';
```
- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
- `wider_eval.m` is the main program of the evaluation module. Run it as follows:
```
matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
```
#### Evaluate on the FDDB
[FDDB dataset](http://vis-www.cs.umass.edu/fddb/) details can refer to FDDB's official website.
- Download the official dataset and evaluation script to evaluate the ROC metrics:
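#### Evaluate on the FDDB
We provide an evaluation workflow for the FDDB dataset (currently Linux only); see the [FDDB official website](http://vis-www.cs.umass.edu/fddb/) for further details.
- 1) Download and install OpenCV:
Download OpenCV: download it manually from the [OpenCV library](https://opencv.org/releases/) page.
Install OpenCV: build and install from source, following the [official OpenCV installation tutorial](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html).
- 2) Download the dataset and evaluation code, and format the data: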
```
#external link to the Faces in the Wild data set
wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
#The annotations are split into ten folds. See README for details.
wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
#information on directory structure and file formats
wget http://vis-www.cs.umass.edu/fddb/README.txt
./dataset/fddb/download.sh
```
- Install OpenCV: Requires [OpenCV library](http://sourceforge.net/projects/opencvlibrary/)
If the utility 'pkg-config' is not available for your operating system,
edit the Makefile to manually specify the OpenCV flags as following:
- 3) Compile the FDDB evaluation code:
Go to the `dataset/fddb/evaluation` directory and modify the Makefile as follows:
```
INCS = -I/usr/local/include/opencv
LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
evaluate: $(OBJS)
$(CC) $(OBJS) -o $@ $(LIBS)
```
Modify `common.hpp` as follows:
```
#define __IMAGE_FORMAT__ ".jpg"
//#define __IMAGE_FORMAT__ ".ppm"
#define __CVLOADIMAGE_WORKING__
```
Use `grep -r "CV_RGB"` to find the code containing `CV_RGB`, change `CV_RGB` to `Scalar`, and add `using namespace cv;` to the cpp files.
Then compile:
```
make clean && make
```
- Compile FDDB evaluation code: execute `make` in evaluation folder.
- Generate full image path list and groundtruth in FDDB-folds. The run command is as follows:
- 4) Start evaluation:
Modify the `dataset_dir` and `annotation` fields in the config file:
```
EvalReader:
  ...
  dataset:
    dataset_dir: dataset/fddb
    annotation: FDDB-folds/fddb_annotFile.txt
    ...
```
Evaluate and generate the result file:
```
cat `ls|grep -v"ellipse"` > filePath.txt` and `cat *ellipse* > fddb_annotFile.txt`
python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
-o weights=output/blazeface/model_final/ \
--eval_mode=fddb
```
- Evaluation
Finally evaluation command is:
After evaluation completes, the test results are written in txt format to `output/pred/pred_fddb_res.txt`.
Generate the ContROC and DiscROC data:
```
./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
-d DETECTION_RESULT.txt -f 0 \
-i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
-r ./OUTPUT_DIR -z .jpg
cd dataset/fddb/evaluation
./evaluate -a ./FDDB-folds/fddb_annotFile.txt \
-f 0 -i ./ -l ./FDDB-folds/filePath.txt -z .jpg \
-d {RESULT_FILE} \
-r {OUTPUT_DIR}
```
**NOTES:** The interpretation of the argument can be performed by `./evaluate --help`.
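**NOTES:**
(1) `RESULT_FILE` is the FDDB prediction result file produced by `tools/face_eval.py`;
(2) `OUTPUT_DIR` is the prefix of the FDDB evaluation output files; two files are generated, `{OUTPUT_DIR}ContROC.txt` and `{OUTPUT_DIR}DiscROC.txt`;
(3) Run `./evaluate --help` for argument usage and descriptions.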
## Algorithm Description
## Algorithm Description
### BlazeFace
**Introduction:**
[BlazeFace](https://arxiv.org/abs/1907.05047) is a face detection model published by Google Research.
It is lightweight yet performs well, and is tailored for mobile GPU inference. It runs at
200-1000+ FPS on flagship devices.
**Key features:**
- The anchor scheme stops at 8×8 (input 128x128), with 6 anchors per pixel at that resolution.
- 5 single and 6 double BlazeBlocks: 5×5 depthwise convs give the same accuracy with fewer layers.
- Replaces the non-maximum suppression algorithm with a blending strategy that estimates the
regression parameters of a bounding box as a weighted mean of the overlapping predictions.
**Edition information:**
- Original: Reproduction of the original paper.
- Lite: Replaces 5x5 convs with 3x3 convs, with fewer network layers and conv channels.
- NAS: Uses a `Neural Architecture Search` algorithm to optimize the network structure,
with fewer network layers and conv channels than `Lite`.
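**Introduction:**
[BlazeFace](https://arxiv.org/abs/1907.05047) is a face detection model released by Google Research. It is lightweight and performs well,
and is tailored for mobile GPU inference. On flagship devices it reaches 200-1000+ FPS.
**Key features:**
- The anchor scheme stops at the 8×8 feature map (input 128x128), with 6 anchors per pixel at that resolution;
- 5 single BlazeBlocks and 6 double BlazeBlocks: 5×5 depthwise convolutions achieve the same accuracy with fewer layers;
- Non-maximum suppression is replaced with a blending strategy that estimates the regression parameters of a bounding box as a weighted mean of the overlapping predictions.
**Edition information:**
- Original: reproduction of the original paper;
- Lite: replaces 5x5 convolutions with 3x3 convolutions, with fewer network layers and channels;
- NAS: builds the network structure with a neural architecture search algorithm; compared with `Lite`, the NAS edition uses fewer network layers and channels.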
### FaceBoxes
**Introduction:**
[FaceBoxes](https://arxiv.org/abs/1708.05234), from the paper "A CPU Real-time Face Detector
with High Accuracy", is a face detector proposed by Shifeng Zhang et al. that performs well in
both speed and accuracy. The paper was published at IJCB (2017).
**Key features:**
- With a 640x640 network input, the anchor scheme stops at 20x20, 10x10 and 5x5,
with 3, 1 and 1 anchors per pixel at each resolution. The corresponding densities
are 1, 2, 4 (20x20), 4 (10x10) and 4 (5x5).
- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
- Uses density prior boxes to improve detection accuracy.
**Edition information:**
- Original: Reproduction of the original paper.
- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
The anchor scheme stops at 80x80 and 40x40, with 3 and 1 anchors per pixel at each resolution.
The corresponding densities are 1, 2, 4 (80x80) and 4 (40x40), using fewer conv channels than the original.
## Contributing
Contributions are highly welcome, and we would really appreciate your feedback!
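**Introduction:**
[FaceBoxes](https://arxiv.org/abs/1708.05234) is a fast, accurate face detector proposed by Shifeng Zhang et al.,
described as "A CPU Real-time Face Detector with High Accuracy". The paper was published at IJCB (2017).
**Key features:**
- The anchor scheme is applied at 20x20, 10x10 and 5x5 (input 640x640), with 3, 1 and 1 anchors per pixel respectively; the corresponding density coefficients are `1, 2, 4` (20x20), 4 (10x10) and 4 (5x5);
- Some blocks of the base network use CReLU and inception structures;
- Using density prior boxes (density_prior_box) improves detection accuracy.
**Edition information:**
- Original: reproduction of the original paper;
- Lite: uses fewer network layers and channels; see the [code](../../ppdet/modeling/architectures/faceboxes.py) for details.
## Contributing
Contributions to the face detection models in PaddleDetection are very welcome: you can submit a PR for us to review. We also greatly appreciate your feedback; please open an issue and we will respond promptly.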
English | [简体中文](README.md)
# FaceDetection
## Table of Contents
- [Introduction](#Introduction)
- [Benchmark and Model Zoo](#Benchmark-and-Model-Zoo)
- [Quick Start](#Quick-Start)
- [Data Pipline](#Data-Pipline)
- [Training and Inference](#Training-and-Inference)
- [Evaluation](#Evaluation)
- [Algorithm Description](#Algorithm-Description)
- [Contributing](#Contributing)
## Introduction
The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
including cutting-edge and classic models.
<div align="center">
<img src="../../demo/output/12_Group_Group_12_Group_Group_12_935.jpg" />
</div>
## Benchmark and Model Zoo
The architectures supported by PaddleDetection are shown in the table below; please refer to
[Algorithm Description](#Algorithm-Description) for details of the algorithms.
| | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
|:------------------------:|:--------:|:--------------------------:|:------------------------:|
| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
<a name="lite">[1]</a> The `Lite` edition reduces the number of network layers and channels.
<a name="nas">[2]</a> The `NAS` edition uses a `Neural Architecture Search` algorithm to
optimize the network structure.
### Model Zoo
#### mAP in WIDER FACE
| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
**NOTES:**
- The mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation in `tools/face_eval.py`.
For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
- BlazeFace-Lite training and testing use the [blazeface.yml](../../configs/face_detection/blazeface.yml)
config file with `lite_edition: true` set.
#### mAP in FDDB
| Architecture | Type | Size | DiscROC | ContROC |
|:------------:|:--------:|:----:|:-------:|:-------:|
| BlazeFace | Original | 640 | **0.992** | **0.762** |
| BlazeFace | Lite | 640 | 0.990 | 0.756 |
| BlazeFace | NAS | 640 | 0.981 | 0.741 |
| FaceBoxes | Original | 640 | 0.987 | 0.736 |
| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
**NOTES:**
- These results are obtained by multi-scale evaluation on the FDDB dataset.
For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
#### Inference Time and Model Size Comparison
| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
**NOTES:**
- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.
- The P4(trt32) and CPU inference times were measured with PaddlePaddle 1.6.1.
- ARM test environment:
- Qualcomm SnapDragon 855(armv8);
- Single thread;
- Paddle-Lite version 2.0.0.
## Quick Start
### Data Pipline
We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training
and testing; the official website gives a detailed introduction to the data.
- WIDER Face data source:
Loads a `wider_face` type dataset with a directory structure like this:
```
dataset/wider_face/
├── wider_face_split
│ ├── wider_face_train_bbx_gt.txt
│ ├── wider_face_val_bbx_gt.txt
├── WIDER_train
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_100.jpg
│ │ │ ├── 0_Parade_marchingband_1_381.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
├── WIDER_val
│ ├── images
│ │ ├── 0--Parade
│ │ │ ├── 0_Parade_marchingband_1_1004.jpg
│ │ │ ├── 0_Parade_marchingband_1_1045.jpg
│ │ │ │ ...
│ │ ├── 10--People_Marching
│ │ │ ...
```
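Before starting a training run, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of PaddleDetection: the helper name `missing_wider_face_entries` and the `EXPECTED` list are hypothetical, and the checked paths are simply the top-level entries shown in the tree.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to the dataset root.
# (Hypothetical helper; PaddleDetection does its own dataset discovery.)
EXPECTED = [
    "wider_face_split/wider_face_train_bbx_gt.txt",
    "wider_face_split/wider_face_val_bbx_gt.txt",
    "WIDER_train/images",
    "WIDER_val/images",
]


def missing_wider_face_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_wider_face_entries("dataset/wider_face")
    print("layout OK" if not missing else "missing: {}".format(missing))
```

An empty result means the four top-level entries exist; it does not verify the image files themselves.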
- Download dataset manually:
To download the WIDER FACE dataset, run the following commands:
```
cd dataset/wider_face && ./download.sh
```
- Download dataset automatically:
If a training session is started but the dataset is not set up properly
(e.g., not found in `dataset/wider_face`), PaddleDetection can automatically
download it from the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) site.
The decompressed dataset is cached in `~/.cache/paddle/dataset/` and is discovered
automatically in subsequent runs.
#### Data Augmentation
- **Data-anchor-sampling:** Randomly rescales the image within a range of scales,
which greatly increases the scale variation of the faces. Specifically, compute $v=\sqrt{width * height}$
from the width and height of a randomly selected face, and determine which interval of
`[16,32,64,128]` contains `v`. For example, `v=45` lies in `32<v<64`, so one value of `[16,32,64]` is selected
with uniform probability. If `64` is selected, the target face scale is drawn from `[64 / 2, min(v * 2, 64 * 2)]`.
- **Other methods:** Including `RandomDistort`,`ExpandImage`,`RandomInterpImage`,`RandomFlipImage` etc.
Please refer to [DATA.md](../../docs/DATA.md#APIs) for details.
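The data-anchor-sampling rule described in this list can be sketched as follows. This is an illustration that follows the README wording only, not the PaddleDetection implementation: the function name `sample_face_scale`, its `rng` parameter, and the clamping of the upper bound for very small faces are assumptions.

```python
import math
import random


def sample_face_scale(width, height, scales=(16, 32, 64, 128), rng=random):
    """Pick a target face size following the README description.

    Assumes width and height are positive. Returns (chosen_scale,
    target_size, resize_ratio), where resize_ratio = target_size / v.
    """
    v = math.sqrt(width * height)
    # Index of the first scale that is >= v (the upper end of v's interval);
    # faces larger than every scale fall back to the last entry.
    upper = len(scales) - 1
    for i, s in enumerate(scales):
        if v <= s:
            upper = i
            break
    # Any scale up to and including that upper end, with uniform probability.
    chosen = scales[rng.randint(0, upper)]
    low = chosen / 2
    # The README gives [chosen / 2, min(v * 2, chosen * 2)]; the max() guard
    # for very small faces is an assumption of this sketch.
    high = max(low, min(v * 2, chosen * 2))
    target = rng.uniform(low, high)
    return chosen, target, target / v
```

For a 45x45 face, `v = 45`, so the candidates are 16, 32 and 64; if 64 is drawn, the target size is sampled from `[32, 90]`.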
### Training and Inference
For `Training` and `Inference`, please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md).
**NOTES:**
- `BlazeFace` and `FaceBoxes` are trained on 4 GPUs with `batch_size=8` per GPU (total batch size of 32)
for 320000 iterations. (If your GPU count is not 4, please refer to the training-parameter rules
in the table of [calculation rules](../../docs/GETTING_STARTED.md#faq).)
- Evaluation during training is currently not supported.
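As a rough illustration of adapting the schedule to a different GPU count, the sketch below assumes the common linear scaling rule (learning rate proportional to the total batch size, iteration count inversely proportional). That rule, the helper name `rescale_schedule`, and the base learning rate used in the example are assumptions; the FAQ table linked above remains the authoritative reference.

```python
def rescale_schedule(base_lr, base_iters, base_gpus, gpus):
    """Scale learning rate and iteration count for a different GPU count.

    Assumes the per-GPU batch size stays the same and that the linear
    scaling rule applies (an assumption, not taken from this README).
    """
    if gpus <= 0 or base_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    factor = gpus / base_gpus
    lr = base_lr * factor
    iters = int(round(base_iters / factor))
    return lr, iters


# Example: reference setup of 4 GPUs and 320000 iterations, moved to 2 GPUs.
# The base learning rate 0.001 is a placeholder value for illustration.
lr, iters = rescale_schedule(0.001, 320000, base_gpus=4, gpus=2)
```

Under these assumptions, halving the GPU count halves the learning rate and doubles the number of iterations.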
### Evaluation
Evaluation is currently supported on the `WIDER FACE` dataset and the `FDDB` dataset. First run `tools/face_eval.py`
to generate the evaluation result file, and then use matlab (WIDER FACE)
or OpenCV (FDDB) to calculate the specific evaluation metrics.
The optional arguments for running `tools/face_eval.py` are as follows:
- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
- `-e` or `--eval_mode`: Evaluation mode, either `widerface` or `fddb`, default is `widerface`.
- `--multi_scale`: Add this flag to the command to select `multi_scale` evaluation.
The default is `False`, which selects `single-scale` evaluation.
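The three options listed above can be mirrored with the standard library's `argparse`, which makes their defaults easy to see. This is only a sketch: the real script builds on PaddleDetection's own `ArgsParser` (which also handles `-c` and `-o`), and the `choices` restriction and help texts here are assumptions.

```python
import argparse


def build_parser():
    """Stand-in parser covering only the options documented above."""
    parser = argparse.ArgumentParser(description="face_eval.py options (sketch)")
    parser.add_argument(
        "-f", "--output_eval", default="output/pred", type=str,
        help="Evaluation file directory")
    parser.add_argument(
        "-e", "--eval_mode", default="widerface", type=str,
        choices=["widerface", "fddb"],  # restriction assumed for illustration
        help="Evaluation mode")
    parser.add_argument(
        "--multi_scale", action="store_true", default=False,
        help="Use multi-scale evaluation instead of single-scale")
    return parser


args = build_parser().parse_args(["--eval_mode=fddb", "--multi_scale"])
```

With no arguments the parser yields `output/pred`, `widerface`, and single-scale evaluation, matching the documented defaults.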
#### Evaluate on the WIDER FACE
- Evaluate and generate results files:
```
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
-o weights=output/blazeface/model_final/ \
--eval_mode=widerface
```
After the evaluation is completed, the test result in txt format will be generated in `output/pred`.
- Download the official evaluation script to evaluate the AP metrics:
```
wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
unzip eval_tools.zip && rm -f eval_tools.zip
```
- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
```
% Modify the folder name where the result is stored.
pred_dir = './pred';
% Modify the name of the curve to be drawn.
legend_name = 'Fluid-BlazeFace';
```
- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
```
matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
```
#### Evaluate on the FDDB
We provide an evaluation process for the FDDB dataset (currently Linux only);
please refer to the [FDDB official website](http://vis-www.cs.umass.edu/fddb/) for other details.
- 1) Download and install OpenCV:
Download OpenCV: go to the [OpenCV library](https://opencv.org/releases/) page and download it manually.
Install OpenCV: please refer to the [Official OpenCV Installation Tutorial](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html)
to install from source.
- 2) Download the dataset and evaluation code, and format the data:
```
./dataset/fddb/download.sh
```
- 3)Compile FDDB evaluation code:
Go to the `dataset/fddb/evaluation` directory and modify the Makefile as follows:
```
evaluate: $(OBJS)
$(CC) $(OBJS) -o $@ $(LIBS)
```
Modify the content in `common.hpp` to the following form:
```
#define __IMAGE_FORMAT__ ".jpg"
//#define __IMAGE_FORMAT__ ".ppm"
#define __CVLOADIMAGE_WORKING__
```
Use `grep -r "CV_RGB"` to find the code containing `CV_RGB`, change `CV_RGB` to `Scalar`,
and add `using namespace cv;` to the cpp files, then compile:
```
make clean && make
```
- 4)Start evaluation:
Modify the `dataset_dir` and `annotation` fields in the config file:
```
EvalReader:
  ...
  dataset:
    dataset_dir: dataset/fddb
    annotation: FDDB-folds/fddb_annotFile.txt
    ...
```
Evaluate and generate results files:
```
python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
-o weights=output/blazeface/model_final/ \
--eval_mode=fddb
```
After the evaluation is completed, the test result in txt format will be generated in `output/pred/pred_fddb_res.txt`.
Generate ContROC and DiscROC data:
```
cd dataset/fddb/evaluation
./evaluate -a ./FDDB-folds/fddb_annotFile.txt \
-f 0 -i ./ -l ./FDDB-folds/filePath.txt -z .jpg \
-d {RESULT_FILE} \
-r {OUTPUT_DIR}
```
**NOTES:**
(1) `RESULT_FILE` is the FDDB prediction result file output by `tools/face_eval.py`;
(2) `OUTPUT_DIR` is the prefix of the FDDB evaluation output files;
two files are generated, `{OUTPUT_DIR}ContROC.txt` and `{OUTPUT_DIR}DiscROC.txt`;
(3) Run `./evaluate --help` for a description of the arguments.
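When scripting this step, the `./evaluate` invocation above can be assembled programmatically. The sketch below only rebuilds the argument list shown in the command; the helper names `build_evaluate_cmd` and `expected_outputs` and the example paths are hypothetical, and nothing is executed here.

```python
def build_evaluate_cmd(result_file, output_prefix,
                       folds_dir="./FDDB-folds", image_dir="./"):
    """Return the argument list for the FDDB `evaluate` binary.

    The flags are copied from the command above; run it from
    dataset/fddb/evaluation so the relative paths resolve.
    """
    return [
        "./evaluate",
        "-a", "{}/fddb_annotFile.txt".format(folds_dir),
        "-f", "0",
        "-i", image_dir,
        "-l", "{}/filePath.txt".format(folds_dir),
        "-z", ".jpg",
        "-d", result_file,
        "-r", output_prefix,
    ]


def expected_outputs(output_prefix):
    """Files produced by the tool: the prefix plus fixed suffixes (see note 2)."""
    return [output_prefix + "ContROC.txt", output_prefix + "DiscROC.txt"]


# Example paths are placeholders, not values from the repository.
cmd = build_evaluate_cmd("pred_fddb_res.txt", "./blazeface_")
```

The list can then be passed to `subprocess.run(cmd, cwd="dataset/fddb/evaluation", check=True)`. Because `OUTPUT_DIR` is a prefix rather than a directory, a trailing separator or underscore keeps the two output file names readable.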
## Algorithm Description
### BlazeFace
**Introduction:**
[BlazeFace](https://arxiv.org/abs/1907.05047) is a face detection model published by Google Research.
It is lightweight yet performs well, and is tailored for mobile GPU inference. It runs at
200-1000+ FPS on flagship devices.
**Key features:**
- The anchor scheme stops at 8×8 (input 128x128), with 6 anchors per pixel at that resolution.
- 5 single and 6 double BlazeBlocks: 5×5 depthwise convs give the same accuracy with fewer layers.
- Replaces the non-maximum suppression algorithm with a blending strategy that estimates the
regression parameters of a bounding box as a weighted mean of the overlapping predictions.
**Edition information:**
- Original: Reproduction of the original paper.
- Lite: Replaces 5x5 convs with 3x3 convs, with fewer network layers and conv channels.
- NAS: Uses a `Neural Architecture Search` algorithm to optimize the network structure,
with fewer network layers and conv channels than `Lite`.
### FaceBoxes
**Introduction:**
[FaceBoxes](https://arxiv.org/abs/1708.05234), from the paper "A CPU Real-time Face Detector
with High Accuracy", is a face detector proposed by Shifeng Zhang et al. that performs well in
both speed and accuracy. The paper was published at IJCB (2017).
**Key features:**
- With a 640x640 network input, the anchor scheme stops at 20x20, 10x10 and 5x5,
with 3, 1 and 1 anchors per pixel at each resolution. The corresponding densities
are 1, 2, 4 (20x20), 4 (10x10) and 4 (5x5).
- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
- Uses density prior boxes to improve detection accuracy.
**Edition information:**
- Original: Reproduction of the original paper.
- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
The anchor scheme stops at 80x80 and 40x40, with 3 and 1 anchors per pixel at each resolution.
The corresponding densities are 1, 2, 4 (80x80) and 4 (40x40), using fewer conv channels than the original.
## Contributing
Contributions are highly welcome, and we would really appreciate your feedback!
# All rights `PaddleDetection` reserved
# References:
# @TechReport{fddbTech,
# author = {Vidit Jain and Erik Learned-Miller},
# title = {FDDB: A Benchmark for Face Detection in Unconstrained Settings},
# institution = {University of Massachusetts, Amherst},
# year = {2010},
# number = {UM-CS-2010-009}
# }
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"
# Download the data.
echo "Downloading..."
# external link to the Faces in the Wild data set and annotations file
wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
wget http://vis-www.cs.umass.edu/fddb/evaluation.tgz
# Extract the data.
echo "Extracting..."
tar -zxf originalPics.tar.gz
tar -zxf FDDB-folds.tgz
tar -zxf evaluation.tgz
# Generate full image path list and groundtruth in FDDB-folds:
cd FDDB-folds
cat `ls | grep -v "ellipse"` > filePath.txt && cat *ellipse* > fddb_annotFile.txt
cd ..
echo "------------- All done! --------------"
......@@ -34,6 +34,7 @@ class BlazeNet(object):
double_blaze_filters (list): number of filter for each double_blaze block
with_extra_blocks (bool): whether or not extra blocks should be added
lite_edition (bool): whether or not is blazeface-lite
use_5x5kernel (bool): whether or not filter size is 5x5 in depth-wise conv
"""
def __init__(
......@@ -42,13 +43,15 @@ class BlazeNet(object):
double_blaze_filters=[[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
[96, 24, 96, 2], [96, 24, 96], [96, 24, 96]],
with_extra_blocks=True,
lite_edition=False):
lite_edition=False,
use_5x5kernel=True):
super(BlazeNet, self).__init__()
self.blaze_filters = blaze_filters
self.double_blaze_filters = double_blaze_filters
self.with_extra_blocks = with_extra_blocks
self.lite_edition = lite_edition
self.use_5x5kernel = use_5x5kernel
def __call__(self, input):
if not self.lite_edition:
......@@ -67,13 +70,18 @@ class BlazeNet(object):
"blaze_filters {} not in [2, 3]"
if len(v) == 2:
conv = self.BlazeBlock(
conv, v[0], v[1], name='blaze_{}'.format(k))
conv,
v[0],
v[1],
use_5x5kernel=self.use_5x5kernel,
name='blaze_{}'.format(k))
elif len(v) == 3:
conv = self.BlazeBlock(
conv,
v[0],
v[1],
stride=v[2],
use_5x5kernel=self.use_5x5kernel,
name='blaze_{}'.format(k))
layers = []
......@@ -86,6 +94,7 @@ class BlazeNet(object):
v[0],
v[1],
double_channels=v[2],
use_5x5kernel=self.use_5x5kernel,
name='double_blaze_{}'.format(k))
elif len(v) == 4:
layers.append(conv)
......@@ -95,6 +104,7 @@ class BlazeNet(object):
v[1],
double_channels=v[2],
stride=v[3],
use_5x5kernel=self.use_5x5kernel,
name='double_blaze_{}'.format(k))
layers.append(conv)
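To make the filter specifications in these hunks easier to read, the sketch below decodes one entry of `blaze_filters` or `double_blaze_filters` into named fields, following the `len(v)` branches shown above. It is an illustration, not code from the repository: the helper name `describe_blaze_block` is hypothetical, and treating the first two entries as input/output channels, the default stride of 1, and the 3x3 fallback kernel are assumptions.

```python
def describe_blaze_block(spec, double=False, use_5x5kernel=True):
    """Decode a filter spec the way the hunks above index it.

    Single blocks: [v0, v1] or [v0, v1, stride].
    Double blocks: [v0, v1, double_channels] or [v0, v1, double_channels, stride].
    """
    base_len = 3 if double else 2
    if len(spec) not in (base_len, base_len + 1):
        raise ValueError("unexpected spec length: {}".format(spec))
    info = {
        "in_channels": spec[0],    # assumed meaning of v[0]
        "out_channels": spec[1],   # assumed meaning of v[1]
        "double_channels": spec[2] if double else None,
        # The last entry is the stride when the spec has the longer form;
        # a default stride of 1 is assumed otherwise.
        "stride": spec[-1] if len(spec) == base_len + 1 else 1,
        # 5x5 depth-wise kernel when the flag is set; 3x3 otherwise (assumed).
        "kernel_size": 5 if use_5x5kernel else 3,
    }
    return info


first_double = describe_blaze_block([48, 24, 96, 2], double=True)
```

For the first default double block `[48, 24, 96, 2]`, this yields `double_channels=96` and `stride=2`.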
......
......@@ -124,6 +124,8 @@ def get_shrink(height, width):
max_shrink = max_shrink - 0.4
elif max_shrink >= 5:
max_shrink = max_shrink - 0.5
elif max_shrink <= 0.1:
max_shrink = 0.1
shrink = max_shrink if max_shrink < 1 else 1
return shrink, max_shrink
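The effect of the added lower bound can be seen in isolation with a small sketch. It reproduces only the last two steps of the hunk above (the clamp and the final `shrink` selection), not the full `get_shrink` function; the helper name `finalize_shrink` and its `floor` parameter are hypothetical. Presumably, for very large images the estimated `max_shrink` can fall to zero or below, which would not be a usable resize factor.

```python
def finalize_shrink(max_shrink, floor=0.1):
    """Clamp max_shrink from below, then derive shrink as in the hunk above."""
    if max_shrink <= floor:
        max_shrink = floor
    # shrink never exceeds 1, i.e. this step does not enlarge the image.
    shrink = max_shrink if max_shrink < 1 else 1
    return shrink, max_shrink


# A non-positive estimate (as might arise for a very large image) is clamped.
clamped = finalize_shrink(-0.05)
```

Values between the floor and 1 pass through unchanged, and values of 1 or more keep their `max_shrink` while `shrink` is capped at 1.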
......
......@@ -265,12 +265,6 @@ def main():
if __name__ == '__main__':
parser = ArgsParser()
parser.add_argument(
"-d",
"--dataset_dir",
default=None,
type=str,
help="Dataset path, same as DataFeed.dataset.dataset_dir")
parser.add_argument(
"-f",
"--output_eval",
......