diff --git a/README.md b/README.md
index 7d35d8d9194b3fb32b0086c3769a21119aafb70c..193097678446d18398e7c1c0cadba088f97eabe3 100644
--- a/README.md
+++ b/README.md
@@ -136,4 +136,3 @@ PaddleDetection的目的是为工业界和学术界提供丰富、易用的目
 
 ## 如何贡献代码
 我们非常欢迎你可以为PaddleDetection提供代码,也十分感谢你的反馈。
-
diff --git a/configs/face_detection/README.md b/configs/face_detection/README.md
index a73cf78113f4650bbbef45d9e98a7574665eac93..a2154891bd32b6670eb2e042a7e2a47059a0453b 100644
--- a/configs/face_detection/README.md
+++ b/configs/face_detection/README.md
@@ -1,17 +1,99 @@
+[English](README_en.md) | 简体中文
 # FaceDetection
-The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
-including cutting-edge and classic models.
+## 内容
+- [简介](#简介)
+- [模型库与基线](#模型库与基线)
+- [快速开始](#快速开始)
+    - [数据准备](#数据准备)
+    - [训练与推理](#训练与推理)
+    - [评估](#评估)
+- [算法细节](#算法细节)
+- [如何贡献代码](#如何贡献代码)
+
+## 简介
+FaceDetection的目标是提供高效、高速的人脸检测解决方案,包括最先进的模型和经典模型。
-## Data Pipline
-We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) to carry out the training
-and testing of the model, the official website gives detailed data introduction.
-- WIDER Face data source:
-Loads `wider_face` type dataset with directory structures like this:
+## 模型库与基线
+下表中展示了PaddleDetection当前支持的网络结构,具体细节请参考[算法细节](#算法细节)。
+
+| | 原始版本 | Lite版本 [1](#lite) | NAS版本 [2](#nas) |
+|:------------------------:|:--------:|:--------------------------:|:------------------------:|
+| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
+| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
+
+[1] `Lite版本`表示减少网络层数和通道数。
+[2] `NAS版本`表示使用 `神经网络搜索`方法来构建网络结构。
+
+### 模型库
+
+#### WIDER-FACE数据集上的mAP
+
+| 网络结构 | 类型 | 输入尺寸 | 图片个数/GPU | 学习率策略 | Easy Set | Medium Set | Hard Set | 下载 |
+|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
+| BlazeFace | 原始版本 | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [模型](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace | Lite版本 | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [模型](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
+| BlazeFace | NAS版本 | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [模型](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| FaceBoxes | 原始版本 | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [模型](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
+| FaceBoxes | Lite版本 | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [模型](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
+
+**注意:**
+- 我们使用`tools/face_eval.py`中多尺度评估策略得到`Easy/Medium/Hard Set`里的mAP。具体细节请参考[在WIDER-FACE数据集上评估](#在WIDER-FACE数据集上评估)。
+- BlazeFace-Lite的训练与测试使用 [blazeface.yml](../../configs/face_detection/blazeface.yml)配置文件并且设置:`lite_edition: true`。
+
+#### FDDB数据集上的mAP
+
+| 网络结构 | Type | Size | DiscROC | ContROC |
+|:------------:|:--------:|:----:|:-------:|:-------:|
+| BlazeFace | 原始版本 | 640 | **0.992** | **0.762** |
+| BlazeFace | Lite版本 | 640 | 0.990 | 0.756 |
+| BlazeFace | NAS版本 | 640 | 0.981 | 0.741 |
+| FaceBoxes | 原始版本 | 640 | 0.987 | 0.736 |
+| FaceBoxes | Lite版本 | 640 | 0.988 | 0.751 |
+
+**注意:**
+- 我们在FDDB数据集上使用多尺度测试的方法得到mAP,具体细节请参考[在FDDB数据集上评估](#在FDDB数据集上评估)。
+
+#### 推理时间和模型大小比较
+
+| 网络结构 | 类型 | 输入尺寸 | P4(trt32) (ms) | CPU (ms) | 高通骁龙855(armv8) (ms) | 模型大小(MB) |
+|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
+| BlazeFace | 原始版本 | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
+| BlazeFace | Lite版本 | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
+| BlazeFace | NAS版本 | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
+| FaceBoxes | 原始版本 | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
+| FaceBoxes | Lite版本 | 128 | 2.295 | 11.276 | 8.5278 | 2 |
+| BlazeFace | 原始版本 | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
+| BlazeFace | Lite版本 | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
+| BlazeFace | NAS版本 | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
+| FaceBoxes | 原始版本 | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
+| FaceBoxes | Lite版本 | 320 | 18.605 | 78.862 | 59.8996 | 2 |
+| BlazeFace | 原始版本 | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
+| BlazeFace | Lite版本 | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
+| BlazeFace | NAS版本 | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
+| FaceBoxes | 原始版本 | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
+| FaceBoxes | Lite版本 | 640 | 59.47 | 313.683 | 139.918 | 2 |
+
+
+**注意:**
+- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz。
+- P4(trt32)和CPU的推理时间测试基于PaddlePaddle-1.6.1版本。
+- ARM测试环境:
+    - 高通骁龙855(armv8);
+    - 单线程;
+    - Paddle-Lite 2.0.0版本。
+
+
+## 快速开始
+
+### 数据准备
+我们使用[WIDER-FACE数据集](http://shuoyang1213.me/WIDERFACE/)进行训练和模型测试,官方网站提供了详细的数据介绍。
+- WIDER-Face数据源:
+使用如下目录结构加载`wider_face`类型的数据集:
 ```
 dataset/wider_face/
@@ -36,228 +118,159 @@ Loads `wider_face` type dataset with directory structures like this:
 │   │   │   ...
 ```
-- Download dataset manually:
-To download the WIDER FACE dataset, run the following commands:
+- 手动下载数据集:
+要下载WIDER-FACE数据集,请运行以下命令:
 ```
 cd dataset/wider_face && ./download.sh
 ```
-- Download dataset automatically:
-If a training session is started but the dataset is not setup properly
-(e.g, not found in dataset/wider_face), PaddleDetection can automatically
-download them from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/),
-the decompressed datasets will be cached in ~/.cache/paddle/dataset/ and can be discovered
-automatically subsequently.
+- 自动下载数据集:
+如果已经开始训练,但是数据集路径设置不正确或找不到路径, PaddleDetection会从[WIDER-FACE数据集](http://shuoyang1213.me/WIDERFACE/)自动下载它们,
+下载后解压的数据集将缓存在`~/.cache/paddle/dataset/`中,并且之后的训练测试会自动加载它们。
-### Data Augmentation
+#### 数据增强方法
-- **Data-anchor-sampling:** Randomly transform the scale of the image to a certain range of scales,
-greatly enhancing the scale change of the face. The specific operation is to obtain $v=\sqrt{width * height}$
-according to the randomly selected face height and width, and judge the value of `v` in which interval of
- `[16,32,64,128]`. Assuming `v=45` && `32[1](#lite) | NAS [2](#nas) |
-|:------------------------:|:--------:|:--------------------------:|:------------------------:|
-| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
-| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
-
-[1] `Lite` edition means reduces the number of network layers and channels.
-[2] `NAS` edition means use `Neural Architecture Search` algorithm to
-optimized network structure.
-
-**Todo List:**
-- [ ] HamBox
-- [ ] Pyramidbox
+### 评估
+目前我们支持在`WIDER FACE`数据集和`FDDB`数据集上评估。首先运行`tools/face_eval.py`生成评估结果文件,其次使用matlab(WIDER FACE)
+或OpenCV(FDDB)计算具体的评估指标。
+其中,运行`tools/face_eval.py`的参数列表如下:
+- `-f` 或者 `--output_eval`: 评估生成的结果文件保存路径,默认是: `output/pred`;
+- `-e` 或者 `--eval_mode`: 评估模式,包括 `widerface` 和 `fddb`,默认是`widerface`;
+- `--multi_scale`: 如果在命令中添加此操作按钮,它将选择多尺度评估。默认值为`False`,它将选择单尺度评估。
-### Model Zoo
-
-#### mAP in WIDER FACE
-
-| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
-|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
-| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
-| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
-| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
-| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
-| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
-
-**NOTES:**
-- Get mAP in `Easy/Medium/Hard Set` by multi-scale evaluation in `tools/face_eval.py`.
-For details can refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
-- BlazeFace-Lite Training and Testing ues [blazeface.yml](../../configs/face_detection/blazeface.yml)
-configs file and set `lite_edition: true`.
-
-#### mAP in FDDB
-
-| Architecture | Type | Size | DistROC | ContROC |
-|:------------:|:--------:|:----:|:-------:|:-------:|
-| BlazeFace | Original | 640 | **0.992** | **0.762** |
-| BlazeFace | Lite | 640 | 0.990 | 0.756 |
-| BlazeFace | NAS | 640 | 0.981 | 0.741 |
-| FaceBoxes | Original | 640 | 0.987 | 0.736 |
-| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
-
-**NOTES:**
-- Get mAP by multi-scale evaluation on the FDDB dataset.
-For details can refer to [Evaluation](#Evaluate-on-the-FDDB).
-
-#### Infer Time and Model Size comparison
-
-| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
-|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
-| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
-| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
-| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
-| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
-| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
-| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
-| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
-| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
-| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
-| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
-| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
-| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
-| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
-| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
-| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
-
-
-**NOTES:**
-- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
-- P4(trt32) and CPU tests based on PaddlePaddle, PaddlePaddle version is 1.6.1
-- ARM test environment:
-    - Qualcomm SnapDragon 855(armv8)
-    - Single thread
-    - Paddle-Lite version 2.0.0
-
-
-## Get Started
-`Training` and `Inference` please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md)
-- **NOTES:**
-- `BlazeFace` and `FaceBoxes` is trained in 4 GPU with `batch_size=8` per gpu (total batch size as 32)
-and trained 320000 iters.(If your GPU count is not 4, please refer to the rule of training parameters
-in the table of [calculation rules](../../docs/GETTING_STARTED.md#faq))
-- Currently we do not support evaluation in training.
-
-### Evaluation
+#### 在WIDER-FACE数据集上评估
+评估并生成结果文件:
 ```
 export CUDA_VISIBLE_DEVICES=0
 export PYTHONPATH=$PYTHONPATH:.
-python tools/face_eval.py -c configs/face_detection/blazeface.yml
+python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
+       -o weights=output/blazeface/model_final/ \
+       --eval_mode=widerface
 ```
-- Optional arguments
-- `-d` or `--dataset_dir`: Dataset path, same as dataset_dir of configs. Such as: `-d dataset/wider_face`.
-- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
-- `-e` or `--eval_mode`: Evaluation mode, include `widerface` and `fddb`, default is `widerface`.
-- `--multi_scale`: If you add this action button in the command, it will select `multi_scale` evaluation.
-Default is `False`, it will select `single-scale` evaluation.
-
-After the evaluation is completed, the test result in txt format will be generated in `output/pred`,
-and then mAP will be calculated according to different data sets. If you set `--eval_mode=widerface`,
-it will [Evaluate on the WIDER FACE](#Evaluate-on-the-WIDER-FACE).If you set `--eval_mode=fddb`,
-it will [Evaluate on the FDDB](#Evaluate-on-the-FDDB).
-
-#### Evaluate on the WIDER FACE
-- Download the official evaluation script to evaluate the AP metrics:
+评估完成后,将在`output/pred`中生成txt格式的测试结果。
+
+- 下载官方评估脚本来评估AP指标:
 ```
 wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
 unzip eval_tools.zip && rm -f eval_tools.zip
 ```
-- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
+- 在`eval_tools/wider_eval.m`中修改保存结果路径和绘制曲线的名称:
 ```
 # Modify the folder name where the result is stored.
 pred_dir = './pred';
 # Modify the name of the curve to be drawn
 legend_name = 'Fluid-BlazeFace';
 ```
-- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
+- `wider_eval.m` 是评估模块的主要执行程序。运行命令如下:
 ```
 matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
 ```
-#### Evaluate on the FDDB
-[FDDB dataset](http://vis-www.cs.umass.edu/fddb/) details can refer to FDDB's official website.
-- Download the official dataset and evaluation script to evaluate the ROC metrics:
+#### 在FDDB数据集上评估
+我们提供了一套FDDB数据集的评估流程(目前仅支持Linux系统),其他具体细节请参考[FDDB官网](http://vis-www.cs.umass.edu/fddb/)。
+
+- 1)下载安装opencv:
+下载OpenCV: 进入[OpenCV library](https://opencv.org/releases/)手动下载
+安装OpenCV:请参考[OpenCV官方安装教程](https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html)通过源码安装。
+
+- 2)下载数据集、评估代码以及格式化数据:
 ```
-#external link to the Faces in the Wild data set
-wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
-#The annotations are split into ten folds. See README for details.
-wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
-#information on directory structure and file formats
-wget http://vis-www.cs.umass.edu/fddb/README.txt
+./dataset/fddb/download.sh
 ```
-- Install OpenCV: Requires [OpenCV library](http://sourceforge.net/projects/opencvlibrary/)
-If the utility 'pkg-config' is not available for your operating system,
-edit the Makefile to manually specify the OpenCV flags as following:
+
+- 3)编译FDDB评估代码:
+进入`dataset/fddb/evaluation`目录下,修改MakeFile文件中内容如下:
 ```
-INCS = -I/usr/local/include/opencv
-LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
+evaluate: $(OBJS)
+	$(CC) $(OBJS) -o $@ $(LIBS)
+```
+修改`common.hpp`中内容为如下形式:
+```
+#define __IMAGE_FORMAT__ ".jpg"
+//#define __IMAGE_FORMAT__ ".ppm"
+#define __CVLOADIMAGE_WORKING__
+```
+根据`grep -r "CV_RGB"`命令找到含有`CV_RGB`的代码段,将`CV_RGB`改成`Scalar`,并且在cpp中加入`using namespace cv;`,
+然后编译:
+```
+make clean && make
 ```
-- Compile FDDB evaluation code: execute `make` in evaluation folder.
-
-- Generate full image path list and groundtruth in FDDB-folds. The run command is as follows:
+- 4)开始评估:
+修改config文件中`dataset_dir`和`annotation`字段内容:
+```
+EvalReader:
+  ...
+  dataset:
+    dataset_dir: dataset/fddb
+    annotation: FDDB-folds/fddb_annotFile.txt
+    ...
+```
+评估并生成结果文件:
 ```
-cat `ls|grep -v"ellipse"` > filePath.txt` and `cat *ellipse* > fddb_annotFile.txt`
+python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
+       -o weights=output/blazeface/model_final/ \
+       --eval_mode=fddb
 ```
-- Evaluation
-Finally evaluation command is:
+评估完成后,将在`output/pred/pred_fddb_res.txt`中生成txt格式的测试结果。
+生成ContROC与DiscROC数据:
 ```
-./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
-           -d DETECTION_RESULT.txt -f 0 \
-           -i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
-           -r ./OUTPUT_DIR -z .jpg
+cd dataset/fddb/evaluation
+./evaluate -a ./FDDB-folds/fddb_annotFile.txt \
+           -f 0 -i ./ -l ./FDDB-folds/filePath.txt -z .jpg \
+           -d {RESULT_FILE} \
+           -r {OUTPUT_DIR}
 ```
-**NOTES:** The interpretation of the argument can be performed by `./evaluate --help`.
+**注意:**
+(1)`RESULT_FILE`是`tools/face_eval.py`输出的FDDB预测结果文件;
+(2)`OUTPUT_DIR`是FDDB评估输出结果文件前缀,会生成两个文件`{OUTPUT_DIR}ContROC.txt`、`{OUTPUT_DIR}DiscROC.txt`;
+(3)参数用法及注释可通过执行`./evaluate --help`来获取。
-## Algorithm Description
+## 算法细节
 ### BlazeFace
-**Introduction:**
-[BlazeFace](https://arxiv.org/abs/1907.05047) is Google Research published face detection model.
-It's lightweight but good performance, and tailored for mobile GPU inference. It runs at a speed
-of 200-1000+ FPS on flagship devices.
-
-**Particularity:**
-- Anchor scheme stops at 8×8(input 128x128), 6 anchors per pixel at that resolution.
-- 5 single, and 6 double BlazeBlocks: 5×5 depthwise convs, same accuracy with fewer layers.
-- Replace the non-maximum suppression algorithm with a blending strategy that estimates the
-regression parameters of a bounding box as a weighted mean between the overlapping predictions.
-
-**Edition information:**
-- Original: Reference original paper reproduction.
-- Lite: Replace 5x5 conv with 3x3 conv, fewer network layers and conv channels.
-- NAS: use `Neural Architecture Search` algorithm to optimized network structure,
-less network layer and conv channel number than `Lite`.
+**简介:**
+[BlazeFace](https://arxiv.org/abs/1907.05047) 是Google Research发布的人脸检测模型。它轻巧并且性能良好,
+专为移动GPU推理量身定制。在旗舰设备上,速度可达到200-1000+FPS。
+
+**特点:**
+- 锚点策略在8×8(输入128x128)的特征图上停止,在该分辨率下每个像素点6个锚点;
+- 5个单BlazeBlock和6个双BlazeBlock:5×5 depthwise卷积,可以保证在相同精度下网络层数更少;
+- 用混合策略替换非极大值抑制算法,该策略将边界框的回归参数估计为重叠预测之间的加权平均值。
+
+**版本信息:**
+- 原始版本: 参考原始论文复现;
+- Lite版本: 使用3x3卷积替换5x5卷积,更少的网络层数和通道数;
+- NAS版本: 使用神经网络搜索算法构建网络结构,相比于`Lite`版本,NAS版本需要更少的网络层数和通道数。
 ### FaceBoxes
-**Introduction:**
-[FaceBoxes](https://arxiv.org/abs/1708.05234) which named A CPU Real-time Face Detector
-with High Accuracy is face detector proposed by Shifeng Zhang, with high performance on
-both speed and accuracy. This paper is published by IJCB(2017).
-
-**Particularity:**
-- Anchor scheme stops at 20x20, 10x10, 5x5, which network input size is 640x640,
-including 3, 1, 1 anchors per pixel at each resolution. The corresponding densities
-are 1, 2, 4(20x20), 4(10x10) and 4(5x5).
-- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
-- Use density prior box to improve detection accuracy.
-
-**Edition information:**
-- Original: Reference original paper reproduction.
-- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
-Anchor scheme stops at 80x80 and 40x40, including 3, 1 anchors per pixel at each resolution.
-The corresponding densities are 1, 2, 4(80x80) and 4(40x40), using less conv channel number than lite.
-
-
-## Contributing
-Contributions are highly welcomed and we would really appreciate your feedback!!
+**简介:**
+[FaceBoxes](https://arxiv.org/abs/1708.05234) 由Shifeng Zhang等人提出的高速和高准确率的人脸检测器,
+被称为“高精度CPU实时人脸检测器”。 该论文收录于IJCB(2017)。
+
+**特点:**
+- 锚点策略分别在20x20、10x10、5x5(输入640x640)执行,每个像素点分别是3、1、1个锚点,对应密度系数是`1, 2, 4`(20x20)、4(10x10)、4(5x5);
+- 在基础网络中个别block中使用CReLU和inception的结构;
+- 使用密度先验盒(density_prior_box)可提高检测精度。
+
+**版本信息:**
+- 原始版本: 参考原始论文复现;
+- Lite版本: 使用更少的网络层数和通道数,具体可参考[代码](../../ppdet/modeling/architectures/faceboxes.py)。
+
+
+## 如何贡献代码
+我们非常欢迎您可以为PaddleDetection中的人脸检测模型提供代码,您可以提交PR供我们review;也十分感谢您的反馈,可以提交相应issue,我们会及时解答。
diff --git a/configs/face_detection/README_en.md b/configs/face_detection/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..43020612659e600c7b607e5dd22ff4b14191b7e2
--- /dev/null
+++ b/configs/face_detection/README_en.md
@@ -0,0 +1,302 @@
+English | [简体中文](README.md)
+# FaceDetection
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Benchmark and Model Zoo](#Benchmark-and-Model-Zoo)
+- [Quick Start](#Quick-Start)
+    - [Data Pipline](#Data-Pipline)
+    - [Training and Inference](#Training-and-Inference)
+    - [Evaluation](#Evaluation)
+- [Algorithm Description](#Algorithm-Description)
+- [Contributing](#Contributing)
+
+## Introduction
+The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
+including cutting-edge and classic models.
+
+ +
+
+## Benchmark and Model Zoo
+The architectures supported by PaddleDetection are shown in the table below; please refer to
+[Algorithm Description](#Algorithm-Description) for details of the algorithms.
+
+| | Original | Lite [1](#lite) | NAS [2](#nas) |
+|:------------------------:|:--------:|:--------------------------:|:------------------------:|
+| [BlazeFace](#BlazeFace) | ✓ | ✓ | ✓ |
+| [FaceBoxes](#FaceBoxes) | ✓ | ✓ | x |
+
+[1] The `Lite` edition reduces the number of network layers and channels.
+[2] The `NAS` edition uses the `Neural Architecture Search` algorithm to
+optimize the network structure.
+
+
+### Model Zoo
+
+#### mAP in WIDER FACE
+
+| Architecture | Type | Size | Img/gpu | Lr schd | Easy Set | Medium Set | Hard Set | Download |
+|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
+| BlazeFace | Original | 640 | 8 | 32w | **0.915** | **0.892** | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace | Lite | 640 | 8 | 32w | 0.909 | 0.885 | 0.781 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
+| BlazeFace | NAS | 640 | 8 | 32w | 0.837 | 0.807 | 0.658 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| FaceBoxes | Original | 640 | 8 | 32w | 0.878 | 0.851 | 0.576 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
+| FaceBoxes | Lite | 640 | 8 | 32w | 0.901 | 0.875 | 0.760 | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
+
+**NOTES:**
+- mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation in `tools/face_eval.py`.
+For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
+- BlazeFace-Lite training and testing use the [blazeface.yml](../../configs/face_detection/blazeface.yml)
+config file with `lite_edition: true` set.
+
+#### mAP in FDDB
+
+| Architecture | Type | Size | DiscROC | ContROC |
+|:------------:|:--------:|:----:|:-------:|:-------:|
+| BlazeFace | Original | 640 | **0.992** | **0.762** |
+| BlazeFace | Lite | 640 | 0.990 | 0.756 |
+| BlazeFace | NAS | 640 | 0.981 | 0.741 |
+| FaceBoxes | Original | 640 | 0.987 | 0.736 |
+| FaceBoxes | Lite | 640 | 0.988 | 0.751 |
+
+**NOTES:**
+- Results are obtained by multi-scale evaluation on the FDDB dataset.
+For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
+
+#### Infer Time and Model Size comparison
+
+| Architecture | Type | Size | P4(trt32) (ms) | CPU (ms) | Qualcomm SnapDragon 855(armv8) (ms) | Model size (MB) |
+|:------------:|:--------:|:----:|:--------------:|:--------:|:-------------------------------------:|:---------------:|
+| BlazeFace | Original | 128 | 1.387 | 23.461 | 6.036 | 0.777 |
+| BlazeFace | Lite | 128 | 1.323 | 12.802 | 6.193 | 0.68 |
+| BlazeFace | NAS | 128 | 1.03 | 6.714 | 2.7152 | 0.234 |
+| FaceBoxes | Original | 128 | 3.144 | 14.972 | 19.2196 | 3.6 |
+| FaceBoxes | Lite | 128 | 2.295 | 11.276 | 8.5278 | 2 |
+| BlazeFace | Original | 320 | 3.01 | 132.408 | 70.6916 | 0.777 |
+| BlazeFace | Lite | 320 | 2.535 | 69.964 | 69.9438 | 0.68 |
+| BlazeFace | NAS | 320 | 2.392 | 36.962 | 39.8086 | 0.234 |
+| FaceBoxes | Original | 320 | 7.556 | 84.531 | 52.1022 | 3.6 |
+| FaceBoxes | Lite | 320 | 18.605 | 78.862 | 59.8996 | 2 |
+| BlazeFace | Original | 640 | 8.885 | 519.364 | 149.896 | 0.777 |
+| BlazeFace | Lite | 640 | 6.988 | 284.13 | 149.902 | 0.68 |
+| BlazeFace | NAS | 640 | 7.448 | 142.91 | 69.8266 | 0.234 |
+| FaceBoxes | Original | 640 | 78.201 | 394.043 | 169.877 | 3.6 |
+| FaceBoxes | Lite | 640 | 59.47 | 313.683 | 139.918 | 2 |
+
+
+**NOTES:**
+- CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.
+- P4(trt32) and CPU tests are based on PaddlePaddle 1.6.1.
+- ARM test environment:
+    - Qualcomm SnapDragon 855(armv8);
+    - Single thread;
+    - Paddle-Lite version 2.0.0.
+
+## Quick Start
+
+### Data Pipline
+We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) to carry out the training
+and testing of the model; the official website gives a detailed introduction to the data.
+- WIDER Face data source:
+Loads `wider_face` type dataset with directory structures like this:
+
+  ```
+  dataset/wider_face/
+  ├── wider_face_split
+  │   ├── wider_face_train_bbx_gt.txt
+  │   ├── wider_face_val_bbx_gt.txt
+  ├── WIDER_train
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_100.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_381.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ├── WIDER_val
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_1004.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_1045.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ```
+
+- Download dataset manually:
+To download the WIDER FACE dataset, run the following commands:
+```
+cd dataset/wider_face && ./download.sh
+```
+
+- Download dataset automatically:
+If a training session is started but the dataset is not set up properly
+(e.g., not found in dataset/wider_face), PaddleDetection can automatically
+download them from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/),
+the decompressed datasets will be cached in ~/.cache/paddle/dataset/ and can be discovered
+automatically subsequently.
+
+#### Data Augmentation
+
+- **Data-anchor-sampling:** Randomly transform the scale of the image to a certain range of scales,
+greatly enhancing the scale change of the face. The specific operation is to obtain $v=\sqrt{width * height}$
+according to the randomly selected face height and width, and judge the value of `v` in which interval of
+ `[16,32,64,128]`. Assuming `v=45` && `32 filePath.txt && cat *ellipse* > fddb_annotFile.txt
+cd ..
+echo "------------- All done! --------------"
diff --git a/ppdet/modeling/backbones/blazenet.py b/ppdet/modeling/backbones/blazenet.py
index 54c3f7e262464661f39fb73a9c5c70eabe4955c9..8b4f63db3ab8ae4e1e22a23fde70babb0f9166d9 100644
--- a/ppdet/modeling/backbones/blazenet.py
+++ b/ppdet/modeling/backbones/blazenet.py
@@ -34,6 +34,7 @@ class BlazeNet(object):
         double_blaze_filters (list): number of filter for each double_blaze block
         with_extra_blocks (bool): whether or not extra blocks should be added
         lite_edition (bool): whether or not is blazeface-lite
+        use_5x5kernel (bool): whether or not filter size is 5x5 in depth-wise conv
     """
 
     def __init__(
@@ -42,13 +43,15 @@ class BlazeNet(object):
             double_blaze_filters=[[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
                                   [96, 24, 96, 2], [96, 24, 96], [96, 24, 96]],
             with_extra_blocks=True,
-            lite_edition=False):
+            lite_edition=False,
+            use_5x5kernel=True):
         super(BlazeNet, self).__init__()
 
         self.blaze_filters = blaze_filters
         self.double_blaze_filters = double_blaze_filters
         self.with_extra_blocks = with_extra_blocks
         self.lite_edition = lite_edition
+        self.use_5x5kernel = use_5x5kernel
 
     def __call__(self, input):
         if not self.lite_edition:
@@ -67,13 +70,18 @@ class BlazeNet(object):
                     "blaze_filters {} not in [2, 3]"
                 if len(v) == 2:
                     conv = self.BlazeBlock(
-                        conv, v[0], v[1], name='blaze_{}'.format(k))
+                        conv,
+                        v[0],
+                        v[1],
+                        use_5x5kernel=self.use_5x5kernel,
+                        name='blaze_{}'.format(k))
                 elif len(v) == 3:
                     conv = self.BlazeBlock(
                         conv,
                         v[0],
                         v[1],
                         stride=v[2],
+                        use_5x5kernel=self.use_5x5kernel,
                         name='blaze_{}'.format(k))
 
             layers = []
@@ -86,6 +94,7 @@ class BlazeNet(object):
                         v[0],
                         v[1],
                         double_channels=v[2],
+                        use_5x5kernel=self.use_5x5kernel,
                         name='double_blaze_{}'.format(k))
                 elif len(v) == 4:
                     layers.append(conv)
@@ -95,6 +104,7 @@ class BlazeNet(object):
                         v[1],
                         double_channels=v[2],
                         stride=v[3],
+                        use_5x5kernel=self.use_5x5kernel,
                         name='double_blaze_{}'.format(k))
             layers.append(conv)
 
diff --git a/ppdet/utils/widerface_eval_utils.py b/ppdet/utils/widerface_eval_utils.py
index 6a7d254180b0c8507fd75bec5d36ec795e9aaae4..7f35c2431076cd654114539b290663e9dccbd950 100644
--- a/ppdet/utils/widerface_eval_utils.py
+++ b/ppdet/utils/widerface_eval_utils.py
@@ -124,6 +124,8 @@ def get_shrink(height, width):
         max_shrink = max_shrink - 0.4
     elif max_shrink >= 5:
         max_shrink = max_shrink - 0.5
+    elif max_shrink <= 0.1:
+        max_shrink = 0.1
 
     shrink = max_shrink if max_shrink < 1 else 1
     return shrink, max_shrink
diff --git a/tools/face_eval.py b/tools/face_eval.py
index 7f66021e95026482ba29cd59dbc1f41705ab1fa9..52a4557a819b868771fd45cf771908d86a30a696 100644
--- a/tools/face_eval.py
+++ b/tools/face_eval.py
@@ -265,12 +265,6 @@ def main():
 
 if __name__ == '__main__':
     parser = ArgsParser()
-    parser.add_argument(
-        "-d",
-        "--dataset_dir",
-        default=None,
-        type=str,
-        help="Dataset path, same as DataFeed.dataset.dataset_dir")
     parser.add_argument(
         "-f",
         "--output_eval",
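
Note on the `get_shrink` hunk in `ppdet/utils/widerface_eval_utils.py`: the sketch below mirrors only the branches visible in that hunk, to show what the new `max_shrink <= 0.1` branch does. The helper name `finalize_shrink` is hypothetical, the earlier branches of the real function are not shown in the diff and are omitted here, and the stated motivation (avoiding a zero or negative scale factor) is an assumption rather than something the diff says.

```python
def finalize_shrink(max_shrink):
    """Hypothetical helper: reproduces only the tail of get_shrink
    that is visible in the diff, not the full function."""
    if max_shrink >= 5:
        max_shrink = max_shrink - 0.5
    elif max_shrink <= 0.1:
        # Branch added by this change: clamp small (or negative)
        # values to a floor of 0.1.
        max_shrink = 0.1
    # The shrink factor itself is capped at 1.
    shrink = max_shrink if max_shrink < 1 else 1
    return shrink, max_shrink


for value in (-2.0, 0.05, 0.5, 6.0):
    print(value, finalize_shrink(value))
```

With the clamp in place, inputs at or below 0.1 all yield a usable factor of 0.1, while values between 0.1 and 5 pass through this tail unchanged.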