## SSD Object Detection

## Table of Contents
- [Introduction](#introduction)
- [Data Preparation](#data-preparation)
- [Train](#train)
- [Evaluate](#evaluate)
- [Infer and Visualize](#infer-and-visualize)
- [Released Model](#released-model)

### Introduction

The [Single Shot MultiBox Detector (SSD)](https://arxiv.org/abs/1512.02325) framework for object detection is a single-stage detector. A single-stage detector simplifies object detection into a regression problem, directly predicting bounding boxes and class probabilities without region proposals. SSD further improves accuracy by producing these predictions at different scales from different layers, as shown below. Predictions are made at six levels, each on a feature map of a different scale, and each of these feature maps is followed by two 3x3 convolutional layers, which predict the class scores and the shape offsets relative to the prior boxes (also called anchors), respectively. Thus, we get 38x38x4 + 19x19x6 + 10x10x6 + 5x5x6 + 3x3x4 + 1x1x4 = 8732 detections per class.
<p align="center">
<img src="images/SSD_paper_figure.jpg" height=300 width=900 hspace='10'/> <br />
The Single Shot MultiBox Detector (SSD)
</p>
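
As a quick sanity check of the arithmetic above (this snippet is not part of the project code), the prior-box count for the 300x300 configuration can be reproduced in a few lines of Python:

```python
# Feature map sizes and prior boxes per location for SSD 300x300,
# as listed in the paragraph above.
feature_maps = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

total = sum(size * size * boxes for size, boxes in feature_maps)
print(total)  # 8732 detections per class
```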

SSD can be plugged into a wide variety of standard convolutional networks, such as VGG, ResNet, or MobileNet; this network is also called the base network or backbone. In this tutorial we use [MobileNet](https://arxiv.org/abs/1704.04861).


### Data Preparation

You can use [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) or [MS-COCO dataset](http://cocodataset.org/#download).

If you want to train a model on the PASCAL VOC dataset, please download the dataset first; skip this step if you already have it.

```bash
cd data/pascalvoc
./download.sh
```

The `download.sh` script also creates the training and testing file lists.

If you want to train a model on the MS-COCO dataset, please download the dataset first; skip this step if you already have it.

```bash
cd data/coco
./download.sh
```

### Train

#### Download the Pre-trained Models

We provide two pre-trained models. The first is a MobileNet-v1 SSD model trained on the COCO dataset, with the COCO-specific convolutional predictors removed. It can be used to initialize models when training on other datasets, such as PASCAL VOC. The second is a MobileNet-v1 model trained on the ImageNet 2012 dataset, with the weights and bias of the last fully-connected layer removed.

Note: the MobileNet-v1 SSD model is converted from the [TensorFlow model](https://github.com/tensorflow/models/blob/f87a58cd96d45de73c9a8330a06b2ab56749a7fa/research/object_detection/g3doc/detection_model_zoo.md). The MobileNet-v1 model is converted from [Caffe](https://github.com/shicai/MobileNet-Caffe).
We will release our own pre-trained models soon.

  - Download MobileNet-v1 SSD:
    ```bash
    ./pretrained/download_coco.sh
    ```
  - Download MobileNet-v1:
    ```bash
    ./pretrained/download_imagenet.sh
    ```

#### Train on PASCAL VOC

`train.py` is the main caller of the training module. Examples of usage are shown below.
  ```bash
  python -u train.py --batch_size=64 --dataset='pascalvoc' --pretrained_model='pretrained/ssd_mobilenet_v1_coco/'
  ```
   - Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specify which GPUs to use.
   - Set ```--dataset='coco2014'``` or ```--dataset='coco2017'``` to train the model on the MS-COCO dataset.
   - For more help on arguments:

  ```bash
  python train.py --help
  ```

The data reader is defined in `reader.py`. All images are resized to 300x300. In the training stage, images are randomly distorted, expanded, cropped, and flipped (a sketch of this pipeline is shown after the list):
   - distort: randomly distort brightness, contrast, saturation, and hue.
   - expand: place the original image inside a larger canvas initialized with the image mean.
   - crop: crop the image with respect to different scales, aspect ratios, and overlaps.
   - flip: flip horizontally.
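
The sketch below illustrates this style of preprocessing with PIL and NumPy. It is a simplified, self-contained example: bounding-box adjustment and random cropping are omitted and the helper names are illustrative, so the actual implementation in `reader.py` may differ in detail.

```python
import random

import numpy as np
from PIL import Image, ImageEnhance


def random_distort(img):
    # Randomly jitter brightness, contrast, and saturation (hue omitted here).
    for enhancer in (ImageEnhance.Brightness, ImageEnhance.Contrast,
                     ImageEnhance.Color):
        img = enhancer(img).enhance(random.uniform(0.5, 1.5))
    return img


def random_expand(img, mean=(127, 127, 127), max_ratio=4.0):
    # Paste the image at a random position on a larger canvas
    # filled with the pixel mean.
    w, h = img.size
    ratio = random.uniform(1.0, max_ratio)
    ow, oh = int(w * ratio), int(h * ratio)
    canvas = Image.new('RGB', (ow, oh), mean)
    canvas.paste(img, (random.randint(0, ow - w), random.randint(0, oh - h)))
    return canvas


def preprocess(img):
    # distort -> expand -> (crop omitted) -> flip -> resize to 300x300
    img = random_distort(img)
    img = random_expand(img)
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    return np.array(img.resize((300, 300)))
```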

We train MobileNet-v1 SSD with the RMSProp optimizer and a mini-batch size of 64. The initial learning rate is 0.001 and is decayed at epochs 40, 60, 80, and 100 with multipliers 0.5, 0.25, 0.1, and 0.01, respectively. The weight decay is 0.00005. After 120 epochs we achieve 73.32% mAP under the 11point metric.
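
For clarity, the decay schedule above corresponds to the piecewise-constant learning rate below (plain Python shown only to make the decay points explicit; the training script builds the equivalent schedule from its command-line arguments):

```python
def learning_rate(epoch, base_lr=0.001):
    """Decay at epochs 40, 60, 80, 100 with multipliers
    0.5, 0.25, 0.1, 0.01, as described above."""
    boundaries = [40, 60, 80, 100]
    multipliers = [1.0, 0.5, 0.25, 0.1, 0.01]
    for boundary, multiplier in zip(boundaries, multipliers):
        if epoch < boundary:
            return base_lr * multiplier
    return base_lr * multipliers[-1]

# learning_rate(0)   -> 0.001
# learning_rate(50)  -> 0.0005
# learning_rate(110) -> 0.00001
```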

### Evaluate

You can evaluate your trained model with different metrics, such as 11point and integral, on both the PASCAL VOC and COCO datasets. Note that the default test list is the dataset's test/val list; you can use your own test list by setting the ```--test_list``` argument.
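
For reference, the 11point metric is the PASCAL VOC 2007 interpolated average precision: precision is sampled at the 11 recall levels 0.0, 0.1, ..., 1.0 and averaged. A minimal sketch of that computation (for illustration only, not the code used by `eval.py`):

```python
import numpy as np


def eleven_point_ap(recalls, precisions):
    """Average the best precision achieved at recall >= t
    for t in {0.0, 0.1, ..., 1.0}."""
    recalls = np.asarray(recalls)
    precisions = np.asarray(precisions)
    ap = 0.0
    for t in (i / 10.0 for i in range(11)):
        mask = recalls >= t
        ap += (precisions[mask].max() if mask.any() else 0.0) / 11.0
    return ap
```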

`eval.py` is the main caller of the evaluation module. Examples of usage are shown below.
```bash
python eval.py --dataset='pascalvoc' --model_dir='train_pascal_model/best_model' --data_dir='data/pascalvoc' --test_list='test.txt' --ap_version='11point' --nms_threshold=0.45
```

You can set ```--dataset``` to ```coco2014``` or ```coco2017``` to evaluate on the COCO dataset. Moreover, we provide `eval_coco_map.py`, which uses the COCO-specific mAP metric defined by the [COCO committee](http://cocodataset.org/#detections-eval). `eval_coco_map.py` requires the [cocoapi](https://github.com/cocodataset/cocoapi).
Install the cocoapi:
```bash
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python2 setup.py install --user
```

### Infer and Visualize
`infer.py` is the main caller of the inference module. Examples of usage are shown below.
```bash
python infer.py --dataset='pascalvoc' --nms_threshold=0.45 --model_dir='train_pascal_model/best_model' --image_path='./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg'
```
Below are examples of running inference and visualizing the model results.
<p align="center">
<img src="images/009943.jpg" height=300 width=400 hspace='10'/>
<img src="images/009956.jpg" height=300 width=400 hspace='10'/>
<img src="images/009960.jpg" height=300 width=400 hspace='10'/>
<img src="images/009962.jpg" height=300 width=400 hspace='10'/> <br />
MobileNet-v1-SSD 300x300 Visualization Examples
</p>


### Released Model


| Model                    | Pre-trained Model  | Training data    | Test data    | mAP |
|:------------------------:|:------------------:|:----------------:|:------------:|:----:|
|[MobileNet-v1-SSD 300x300](http://paddlemodels.bj.bcebos.com/ssd_mobilenet_v1_pascalvoc.tar.gz) | COCO MobileNet SSD | VOC07+12 trainval| VOC07 test   | 73.32%  |