diff --git a/configs/face_detection/README.md b/configs/face_detection/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..55146ae9de20d71427b06e7db91a931a6bab19f1
--- /dev/null
+++ b/configs/face_detection/README.md
@@ -0,0 +1,252 @@
+English | [简体中文](README_cn.md)
+
+# FaceDetection
+The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
+including cutting-edge and classic models.
+
+
+## Data Pipeline
+We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training
+and testing; the official website describes the data in detail.
+- WIDER FACE data source:
+Loads a `wider_face` type dataset with a directory structure like this:
+
+  ```
+  dataset/wider_face/
+  ├── wider_face_split
+  │   ├── wider_face_train_bbx_gt.txt
+  │   ├── wider_face_val_bbx_gt.txt
+  ├── WIDER_train
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_100.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_381.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ├── WIDER_val
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_1004.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_1045.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ```
+
+- Download dataset manually:
+To download the WIDER FACE dataset, run the following commands:
+```
+cd dataset/wider_face && ./download.sh
+```
+
+- Download dataset automatically:
+If a training session is started but the dataset is not set up properly
+(e.g., not found in `dataset/wider_face`), PaddleDetection can automatically
+download it from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/).
+The decompressed dataset is cached in `~/.cache/paddle/dataset/` and is discovered
+automatically on subsequent runs.
+
+### Data Augmentation
+
+- **Data-anchor-sampling:** Randomly rescale the image within a certain range of scales,
+greatly increasing the scale variation of faces. Specifically, compute $v=\sqrt{width * height}$
+from the height and width of a randomly selected face, and determine which interval of
+ `[16,32,64,128]` the value of `v` falls in.
+Assuming `v=45`, so that `32<v<64`, [...]
+
+|                          | Original | Lite [1](#lite) | NAS [2](#nas) |
+|:------------------------:|:--------:|:--------------------------:|:------------------------:|
+| [BlazeFace](#BlazeFace)  | ✓        | ✓                          | ✓                        |
+| [FaceBoxes](#FaceBoxes)  | ✓        | ✓                          | x                        |
+
+[1] The `Lite` edition reduces the number of network layers and channels.
+[2] The `NAS` edition uses a `Neural Architecture Search` algorithm to
+optimize the network structure.
+
+**Todo List:**
+- [ ] HamBox
+- [ ] Pyramidbox
+
+### Model Zoo
+
+#### mAP in WIDER FACE
+
+| Architecture | Type     | Size | Img/gpu | Lr schd | Easy Set  | Medium Set | Hard Set  |
+|:------------:|:--------:|:----:|:-------:|:-------:|:--------:|:----------:|:--------:|
+| BlazeFace    | Original | 640  | 8       | 32w     | **0.915** | **0.892**  | **0.797** |
+| BlazeFace    | Lite     | 640  | 8       | 32w     | 0.909     | 0.885      | 0.781     |
+| BlazeFace    | NAS      | 640  | 8       | 32w     | 0.837     | 0.807      | 0.658     |
+| FaceBoxes    | Original | 640  | 8       | 32w     | 0.875     | 0.848      | 0.568     |
+| FaceBoxes    | Lite     | 640  | 8       | 32w     | 0.898     | 0.872      | 0.752     |
+
+**NOTES:**
+- mAP on the `Easy/Medium/Hard Set` is obtained by multi-scale evaluation in `tools/face_eval.py`.
+For details, refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
+- BlazeFace-Lite training and testing use the [blazeface.yml](../../configs/face_detection/blazeface.yml)
+config file with `lite_edition: true`.
+
+#### mAP in FDDB
+
+| Architecture | Type     | Size | DistROC   | ContROC   |
+|:------------:|:--------:|:----:|:-------:|:-------:|
+| BlazeFace    | Original | 640  | **0.992** | **0.762** |
+| BlazeFace    | Lite     | 640  | 0.990     | 0.756     |
+| BlazeFace    | NAS      | 640  | 0.981     | 0.741     |
+| FaceBoxes    | Original | 640  | 0.985     | 0.731     |
+| FaceBoxes    | Lite     | 640  | 0.987     | 0.741     |
+
+**NOTES:**
+- mAP is obtained by multi-scale evaluation on the FDDB dataset.
+For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
+
+#### Infer Time and Model Size comparison
+
+| Architecture | Type     | Size | P4 (ms) | CPU (ms) | ARM (ms) | File size (MB) | Flops |
+|:------------:|:--------:|:----:|:---------:|:--------:|:----------:|:--------------:|:---------:|
+| BlazeFace    | Original | 128  | -       | -        | -        | -              | -     |
+| BlazeFace    | Lite     | 128  | -       | -        | -        | -              | -     |
+| BlazeFace    | NAS      | 128  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Original | 128  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Lite     | 128  | -       | -        | -        | -              | -     |
+| BlazeFace    | Original | 320  | -       | -        | -        | -              | -     |
+| BlazeFace    | Lite     | 320  | -       | -        | -        | -              | -     |
+| BlazeFace    | NAS      | 320  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Original | 320  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Lite     | 320  | -       | -        | -        | -              | -     |
+| BlazeFace    | Original | 640  | -       | -        | -        | -              | -     |
+| BlazeFace    | Lite     | 640  | -       | -        | -        | -              | -     |
+| BlazeFace    | NAS      | 640  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Original | 640  | -       | -        | -        | -              | -     |
+| FaceBoxes    | Lite     | 640  | -       | -        | -        | -              | -     |
+
+
+**NOTES:**
+- CPU: i5-7360U @ 2.30GHz. Single core and single thread.
+
+
+
+## Get Started
+For `Training` and `Inference`, please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md).
+- **NOTES:** Evaluation during training is not currently supported.
+
+### Evaluation
+```
+export CUDA_VISIBLE_DEVICES=0
+export PYTHONPATH=$PYTHONPATH:.
+python tools/face_eval.py -c configs/face_detection/blazeface.yml
+```
+- Optional arguments
+- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, e.g. `-d dataset/wider_face`.
+- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
+- `-e` or `--eval_mode`: Evaluation mode, either `widerface` or `fddb`; default is `widerface`.
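
The optional arguments above can be combined on one command line. As a minimal sketch, the helper `build_face_eval_command` below is hypothetical (it is not part of PaddleDetection); it only assembles an argument list from the documented flags `-c`, `-d`, `-f` and `-e`, and does not run the evaluation.

```python
import shlex


def build_face_eval_command(config, dataset_dir=None, output_eval=None, eval_mode=None):
    """Assemble the argument list for tools/face_eval.py.

    Hypothetical helper for illustration; only the flags documented
    above (-c, -d, -f, -e) are used.
    """
    cmd = ["python", "tools/face_eval.py", "-c", config]
    if dataset_dir is not None:
        cmd += ["-d", dataset_dir]
    if output_eval is not None:
        cmd += ["-f", output_eval]
    if eval_mode is not None:
        # The README lists exactly these two evaluation modes.
        if eval_mode not in ("widerface", "fddb"):
            raise ValueError("eval_mode must be 'widerface' or 'fddb'")
        cmd += ["-e", eval_mode]
    return cmd


if __name__ == "__main__":
    cmd = build_face_eval_command(
        "configs/face_detection/blazeface.yml",
        dataset_dir="dataset/wider_face",
        eval_mode="fddb",
    )
    # Print a copy-pasteable shell command.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Omitted options are left off the command line, so the defaults listed above still apply.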
+
+After the evaluation is completed, the test results are written in txt format to `output/pred`,
+and mAP is then calculated according to the dataset:
+
+#### Evaluate on the WIDER FACE
+- Download the official evaluation script to evaluate the AP metrics:
+```
+wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
+unzip eval_tools.zip && rm -f eval_tools.zip
+```
+- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
+```
+% Modify the folder name where the result is stored.
+pred_dir = './pred';
+% Modify the name of the curve to be drawn
+legend_name = 'Fluid-BlazeFace';
+```
+- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
+```
+matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
+```
+
+#### Evaluate on the FDDB
+- Download the official dataset and evaluation script to evaluate the ROC metrics:
+```
+#external link to the Faces in the Wild data set
+wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
+#The annotations are split into ten folds. See README for details.
+wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
+#information on directory structure and file formats
+wget http://vis-www.cs.umass.edu/fddb/README.txt
+```
+- Install OpenCV: Requires [OpenCV library](http://sourceforge.net/projects/opencvlibrary/)
+If the utility 'pkg-config' is not available for your operating system,
+edit the Makefile to manually specify the OpenCV flags as follows:
+```
+INCS = -I/usr/local/include/opencv
+LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
+```
+
+- Compile FDDB evaluation code: execute `make` in the evaluation folder.
+
+- Generate full image path list and groundtruth in FDDB-folds.
+The run command is as follows:
+```
+cat `ls | grep -v "ellipse"` > filePath.txt
+cat *ellipse* > fddb_annotFile.txt
+```
+- Evaluation
+The final evaluation command is:
+```
+./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
+           -d DETECTION_RESULT.txt -f 0 \
+           -i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
+           -r ./OUTPUT_DIR -z .jpg
+```
+**NOTES:** For an explanation of the arguments, run `./evaluate --help`.
+
+## Algorithm Description
+
+### BlazeFace
+**Introduction:**
+[BlazeFace](https://arxiv.org/abs/1907.05047) is a face detection model published by Google Research.
+It is lightweight yet performs well, and is tailored for mobile GPU inference. It runs at a speed
+of 200-1000+ FPS on flagship devices.
+
+**Particularity:**
+- Anchor scheme stops at 8×8 (input 128x128), with 6 anchors per pixel at that resolution.
+- 5 single and 6 double BlazeBlocks: 5×5 depthwise convs, same accuracy with fewer layers.
+- Replace the non-maximum suppression algorithm with a blending strategy that estimates the
+regression parameters of a bounding box as a weighted mean between the overlapping predictions.
+
+**Edition information:**
+- Original: Reproduction of the original paper.
+- Lite: Replace 5x5 conv with 3x3 conv, with fewer network layers and conv channels.
+- NAS: Use a `Neural Architecture Search` algorithm to optimize the network structure,
+with fewer network layers and conv channels than `Lite`.
+
+### FaceBoxes
+**Introduction:**
+[FaceBoxes](https://arxiv.org/abs/1708.05234), from the paper "A CPU Real-time Face Detector
+with High Accuracy", is a face detector proposed by Shifeng Zhang et al. that performs well in
+both speed and accuracy. The paper was published at IJCB (2017).
+
+**Particularity:**
+- Anchor scheme stops at 20x20, 10x10, 5x5 for a 640x640 network input,
+with 3, 1, 1 anchors per pixel at each resolution. The corresponding densities
+are 1, 2, 4(20x20), 4(10x10) and 4(5x5).
+- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
+- Use density prior box to improve detection accuracy.
+
+**Edition information:**
+- Original: Reproduction of the original paper.
+- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
+Anchor scheme stops at 80x80 and 40x40, with 3, 1 anchors per pixel at each resolution.
+The corresponding densities are 1, 2, 4(80x80) and 4(40x40), using fewer conv channels than the original.
+
+
+## Contributing
+Contributions are highly welcome, and we would really appreciate your feedback!
diff --git a/configs/face_detection/blazeface.yml b/configs/face_detection/blazeface.yml
index 8b27eae70ec3895da635a6a35a9eed94531aeba0..692f14a7cc8091bc8df1f5edbfbca2a9c59b0073 100644
--- a/configs/face_detection/blazeface.yml
+++ b/configs/face_detection/blazeface.yml
@@ -89,7 +89,7 @@ SSDEvalFeed:
   fields: ['image', 'im_id', 'gt_box']
   dataset:
     dataset_dir: dataset/wider_face
-    annotation: annotFile.txt #wider_face_split/wider_face_val_bbx_gt.txt
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt
     image_dir: WIDER_val/images
   drop_last: false
   image_shape: [3, 640, 640]
diff --git a/dataset/wider_face/download.sh b/dataset/wider_face/download.sh
new file mode 100755
index 0000000000000000000000000000000000000000..6c86a22c6826d88846a16fbd43f8b556d8610b8f
--- /dev/null
+++ b/dataset/wider_face/download.sh
@@ -0,0 +1,21 @@
+# All rights `PaddleDetection` reserved
+# References:
+# @inproceedings{yang2016wider,
+# Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
+# Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+# Title = {WIDER FACE: A Face Detection Benchmark},
+# Year = {2016}}
+
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+# Download the data.
+echo "Downloading..."
+wget https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip +wget https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip +wget https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip +# Extract the data. +echo "Extracting..." +unzip WIDER_train.zip +unzip WIDER_val.zip +unzip wider_face_split.zip diff --git a/demo/output/12_Group_Group_12_Group_Group_12_935.jpg b/demo/output/12_Group_Group_12_Group_Group_12_935.jpg new file mode 100644 index 0000000000000000000000000000000000000000..2a563361ae03fbe079dba017374eee51ccbd17dd Binary files /dev/null and b/demo/output/12_Group_Group_12_Group_Group_12_935.jpg differ diff --git a/docs/DATA.md b/docs/DATA.md index c47049b0a7c59d3db83bfaf7f839d6fa99b8880d..0eb686474be7ece55034082b8962948ff7320a86 100644 --- a/docs/DATA.md +++ b/docs/DATA.md @@ -126,6 +126,8 @@ the corresponding data stream. Many aspect of the `Reader`, such as storage location, preprocessing pipeline, acceleration mode can be configured with yaml files. +### APIs + The main APIs are as follows: 1. Data parsing @@ -139,7 +141,7 @@ The main APIs are as follows: - `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py) 2. Operator - `transform/operators.py`: Contains a variety of data enhancement methods, including: + `transform/operators.py`: Contains a variety of data augmentation methods, including: - `DecodeImage`: Read images in RGB format. - `RandomFlipImage`: Horizontal flip. - `RandomDistort`: Distort brightness, contrast, saturation, and hue. @@ -150,7 +152,7 @@ The main APIs are as follows: - `NormalizeImage`: Normalize image pixel values. - `NormalizeBox`: Normalize the bounding box. - `Permute`: Arrange the channels of the image and optionally convert image to BGR format. -- `MixupImage`: Mixup two images with given fraction[1](#vd). +- `MixupImage`: Mixup two images with given fraction[1](#mix). 
[1] Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf)。 diff --git a/docs/DATA_cn.md b/docs/DATA_cn.md index eff8b5489a2cdf9524473c563ce2d90ae9d9bd64..57970169711e5c6d605999f119ae488eeb52f96c 100644 --- a/docs/DATA_cn.md +++ b/docs/DATA_cn.md @@ -105,9 +105,9 @@ python ./ppdet/data/tools/generate_data_for_training.py 4. 数据获取接口 为方便训练时的数据获取,我们将多个`data.Dataset`组合在一起构成一个`data.Reader`为用户提供数据,用户只需要调用`Reader.[train|eval|infer]`即可获得对应的数据流。`Reader`支持yaml文件配置数据地址、预处理过程、加速方式等。 -主要的APIs如下: - +### APIs +主要的APIs如下: 1. 数据解析 diff --git a/ppdet/utils/download.py b/ppdet/utils/download.py index 05f62749192cf0546aabb181c5397e2551806fb2..473cf5ff8fb72a0b203241208171e769f6ba244e 100644 --- a/ppdet/utils/download.py +++ b/ppdet/utils/download.py @@ -60,6 +60,17 @@ DATASETS = { 'http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar', 'b6e924de25625d8de591ea690078ad9f', ), ], ["VOCdevkit/VOC_all"]), + 'wider_face': ([ + ( + 'https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip', + '3fedf70df600953d25982bcd13d91ba2', ), + ( + 'https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip', + 'dfa7d7e790efa35df3788964cf0bbaea', ), + ( + 'https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip', + 'a4a898d6193db4b9ef3260a68bad0dc7', ), + ], ["WIDER_train", "WIDER_val", "wider_face_split"]), 'fruit': ([ ( 'https://dataset.bj.bcebos.com/PaddleDetection_demo/fruit-detection.tar', @@ -114,7 +125,8 @@ def get_dataset_path(path, annotation, image_dir): # not match any dataset in DATASETS raise ValueError("Dataset {} is not valid and cannot parse dataset type " "'{}' for automaticly downloading, which only supports " - "'voc' and 'coco' currently".format(path, osp.split(path)[-1])) + "'voc' and 'coco' currently".format(path, + osp.split(path)[-1])) def _merge_voc_dir(data_dir, output_subdir): @@ -201,7 +213,7 @@ def _dataset_exists(path, annotation, image_dir): """ if not osp.exists(path): logger.info("Config dataset_dir {} is not exits, " - "dataset config is 
not valid".format(path)) + "dataset config is not valid".format(path)) return False if annotation: @@ -324,7 +336,7 @@ def _decompress(fname): def _move_and_merge_tree(src, dst): """ - Move src directory to dst, if dst is already exists, + Move src directory to dst, if dst is already exists, merge src to dst """ if not osp.exists(dst):
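
The `wider_face` entry added to `DATASETS` pairs each archive URL with an MD5 checksum. The sketch below shows one way such a checksum could be verified after a download. The helper names `md5_of_file` and `md5_matches` are hypothetical, and this is not necessarily how `ppdet/utils/download.py` implements the check.

```python
import hashlib


def md5_of_file(path, chunk_size=4096):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def md5_matches(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `md5_matches("WIDER_val.zip", "dfa7d7e790efa35df3788964cf0bbaea")` would compare a downloaded archive against the checksum listed in the diff; a mismatch would suggest a corrupted or incomplete download.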