Commit 843a00f7 authored by Chris Yann, committed by qingqing01

Image classification Networks Updating (#979)

* Add more model configurations, such as AlexNet, VGG, GoogleNet, DPN, ResNet, etc.
* Add download_imagenet2012.sh
* Update doc.
* Add Chinese doc.
Parent 3ccb855b
# Image Classification and Model Zoo
Image classification, an important field of computer vision, aims to classify an image into one of a set of pre-defined labels. Recently, researchers have developed many kinds of neural networks and greatly improved classification performance. This page introduces how to do image classification with PaddlePaddle Fluid, including [data preparation](#data-preparation), [training](#training-a-model), [finetuning](#finetuning), [evaluation](#evaluation) and [inference](#inference).
---
## Table of Contents
- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training a model with flexible parameters](#training-a-model)
- [Finetuning](#finetuning)
- [Evaluation](#evaluation)
- [Inference](#inference)
- [Supported models and performances](#supported-models)
## Installation
Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle on your device is lower than this version, please follow the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) and make an update.
## Data preparation
An example for ImageNet classification is as follows. First of all, preparation of the ImageNet data can be done as:
```
cd data/ILSVRC2012/
sh download_imagenet2012.sh
```
In the shell script ```download_imagenet2012.sh```, there are three steps to prepare data:
**step-1:** Register at ```image-net.org``` first in order to get a pair of ```Username``` and ```AccessKey```, which are used to download ImageNet data.
**step-2:** Download the ImageNet-2012 dataset from the website. The training and validation data will be downloaded into the folders "train" and "val" respectively. Please note that the data is more than 40 GB in size, so downloading will take quite some time. Users who have already downloaded the ImageNet data can organize it into ```data/ILSVRC2012``` directly.
**step-3:** Download training and validation label files. There are two label files which contain train and validation image labels respectively:
* *train_list.txt*: label file of the imagenet-2012 training set, with each line separated by ```SPACE```, like:
```
train/n02483708/n02483708_2436.jpeg 369
train/n03998194/n03998194_7015.jpeg 741
train/n04523525/n04523525_38118.jpeg 884
train/n04596742/n04596742_3032.jpeg 909
train/n03208938/n03208938_7065.jpeg 535
...
```
* *val_list.txt*: label file of the imagenet-2012 validation set, with each line separated by ```SPACE```, like:
```
val/ILSVRC2012_val_00000001.jpeg 65
val/ILSVRC2012_val_00000002.jpeg 970
val/ILSVRC2012_val_00000003.jpeg 230
val/ILSVRC2012_val_00000004.jpeg 809
val/ILSVRC2012_val_00000005.jpeg 516
...
```
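As an optional sanity check (not part of this repository), one can verify that the layout produced by the three steps above is complete before moving on to training:
```
import os

def check_ilsvrc2012(root='data/ILSVRC2012'):
    # the three steps above should leave these entries under data/ILSVRC2012
    expected = ['train', 'val', 'train_list.txt', 'val_list.txt']
    missing = [e for e in expected
               if not os.path.exists(os.path.join(root, e))]
    if missing:
        raise IOError('missing under {}: {}'.format(root, ', '.join(missing)))

check_ilsvrc2012()
```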
## Training a model with flexible parameters
After data preparation, one can start the training step by:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=False \
--lr_strategy=piecewise_decay \
--lr=0.1
```
**parameter introduction:**
* **model**: name of the model to use. Default: "SE_ResNeXt50_32x4d".
* **num_epochs**: the number of epochs. Default: 120.
* **batch_size**: the size of each mini-batch. Default: 256.
* **use_gpu**: whether to use GPU or not. Default: True.
* **total_images**: total number of images in the training set. Default: 1281167.
* **class_dim**: the class number of the classification task. Default: 1000.
* **image_shape**: input size of the network. Default: "3,224,224".
* **model_save_dir**: the directory to save trained model. Default: "output".
* **with_mem_opt**: whether to use memory optimization or not. Default: False.
* **lr_strategy**: learning rate changing strategy. Default: "piecewise_decay". (A sketch of deriving this schedule follows the list below.)
* **lr**: initialized learning rate. Default: 0.1.
* **pretrained_model**: model path for pretraining. Default: None.
* **checkpoint**: the checkpoint path to resume. Default: None.
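To make the relationship between ```lr_strategy```, ```total_images``` and ```batch_size``` concrete, below is a minimal sketch of how a piecewise decay schedule can be derived from epoch milestones, assuming ```fluid.layers.piecewise_decay``` is available as in recent Fluid versions. The helper name ```build_piecewise_decay``` and the milestone list ```[30, 60, 90]``` are illustrative assumptions, not code from ```train.py```.
```
import paddle.fluid as fluid

def build_piecewise_decay(lr, total_images, batch_size, milestone_epochs):
    # steps (mini-batches) per epoch, rounded down
    steps_per_epoch = int(total_images / batch_size)
    # global step counts at which the learning rate is decayed
    boundaries = [epoch * steps_per_epoch for epoch in milestone_epochs]
    # one value per interval: lr, lr*0.1, lr*0.01, ...
    values = [lr * (0.1 ** i) for i in range(len(milestone_epochs) + 1)]
    return fluid.layers.piecewise_decay(boundaries=boundaries, values=values)

# e.g. lr=0.1, total_images=1281167, batch_size=256, milestones [30, 60, 90]
```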
**data reader introduction:** Data reader is defined in ```reader.py```. In the [training stage](#training-a-model), random crop and flipping are used, while center crop is used in the [evaluation](#evaluation) and [inference](#inference) stages. Supported data augmentation includes the following (a minimal reader sketch follows this list):
* rotation
* color jitter
* random crop
* center crop
* resize
* flipping
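The following is a minimal sketch of a training-time reader with random crop and horizontal flip, assuming images have already been resized so that a 224x224 window fits; the names ```random_crop_and_flip``` and ```train_reader``` are illustrative and do not reproduce the actual ```reader.py```.
```
import random
import numpy as np
from PIL import Image

def random_crop_and_flip(img_path, label, size=224):
    img = Image.open(img_path).convert('RGB')
    w, h = img.size
    # random crop: pick a random size-by-size window inside the image
    x = random.randint(0, w - size)
    y = random.randint(0, h - size)
    img = img.crop((x, y, x + size, y + size))
    # horizontal flip with probability 0.5
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    # HWC uint8 -> CHW float32, matching the 3,224,224 input layer
    arr = np.array(img).astype('float32').transpose((2, 0, 1)) / 255.0
    return arr, int(label)

def train_reader(file_list):
    def reader():
        with open(file_list) as f:
            for line in f:
                path, label = line.split()
                yield random_crop_and_flip(path, label)
    return reader

# batched as the scripts in this directory do, e.g.:
# train_batches = paddle.batch(train_reader('train_list.txt'), batch_size=32)
```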
**training curve:** The training curve can be drawn based on the training log. For example, the log from training AlexNet looks like:
```
End pass 1, train_loss 6.23153877258, train_acc1 0.0150696625933, train_acc5 0.0552518665791, test_loss 5.41981744766, test_acc1 0.0519132651389, test_acc5 0.156150355935
End pass 2, train_loss 5.15442800522, train_acc1 0.0784279331565, train_acc5 0.211050540209, test_loss 4.45795249939, test_acc1 0.140469551086, test_acc5 0.333163291216
End pass 3, train_loss 4.51505613327, train_acc1 0.145300447941, train_acc5 0.331567406654, test_loss 3.86548018456, test_acc1 0.219443559647, test_acc5 0.446448504925
End pass 4, train_loss 4.12735557556, train_acc1 0.19437250495, train_acc5 0.405713528395, test_loss 3.56990146637, test_acc1 0.264536827803, test_acc5 0.507190704346
End pass 5, train_loss 3.87505435944, train_acc1 0.229518383741, train_acc5 0.453582793474, test_loss 3.35345435143, test_acc1 0.297349333763, test_acc5 0.54753267765
End pass 6, train_loss 3.6929500103, train_acc1 0.255628824234, train_acc5 0.487188398838, test_loss 3.17112898827, test_acc1 0.326953113079, test_acc5 0.581780135632
End pass 7, train_loss 3.55882954597, train_acc1 0.275381118059, train_acc5 0.511990904808, test_loss 3.03736782074, test_acc1 0.349035382271, test_acc5 0.606293857098
End pass 8, train_loss 3.45595097542, train_acc1 0.291462600231, train_acc5 0.530815005302, test_loss 2.96034455299, test_acc1 0.362228929996, test_acc5 0.617390751839
End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545210540295, test_loss 2.93932366371, test_acc1 0.37129303813, test_acc5 0.623573005199
...
```
The error rate curves of AlexNet, ResNet50 and SE-ResNeXt-50 are shown in the figure below.
<p align="center">
<img src="images/curve.jpg" height=480 width=640 hspace='10'/> <br />
Training and validation curves
</p>
## Finetuning
Finetuning means training a model on a specific task after initializing it with pretrained weights. After setting ```path_to_pretrain_model```, one can finetune a model as:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--pretrained_model=${path_to_pretrain_model} \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=True \
--lr_strategy=piecewise_decay \
--lr=0.1
```
## Evaluation
Evaluation measures the performance of a trained model. One can download [pretrained models](#supported-models) and set ```path_to_pretrain_model``` to the model path. Then the top1/top5 accuracy can be obtained by running the following command:
```
python eval.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
According to the configuration of the evaluation, the output log is like:
```
Testbatch 0,loss 2.1786134243, acc1 0.625,acc5 0.8125,time 0.48 sec
Testbatch 10,loss 0.898496925831, acc1 0.75,acc5 0.9375,time 0.51 sec
Testbatch 20,loss 1.32524681091, acc1 0.6875,acc5 0.9375,time 0.37 sec
Testbatch 30,loss 1.46830511093, acc1 0.5,acc5 0.9375,time 0.51 sec
Testbatch 40,loss 1.12802267075, acc1 0.625,acc5 0.9375,time 0.35 sec
Testbatch 50,loss 0.881597697735, acc1 0.8125,acc5 1.0,time 0.32 sec
Testbatch 60,loss 0.300163716078, acc1 0.875,acc5 1.0,time 0.48 sec
Testbatch 70,loss 0.692037761211, acc1 0.875,acc5 1.0,time 0.35 sec
Testbatch 80,loss 0.0969972759485, acc1 1.0,acc5 1.0,time 0.41 sec
...
```
## Inference
Inference is used to get prediction scores or image features based on trained models.
```
python infer.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
The output contains prediction results, including the maximum score (before softmax) and the corresponding predicted label; a sketch for converting these scores to probabilities follows the example output below.
```
Test-0-score: [13.168352], class [491]
Test-1-score: [7.913302], class [975]
Test-2-score: [16.959702], class [21]
Test-3-score: [14.197695], class [383]
Test-4-score: [12.607652], class [878]
Test-5-score: [17.725458], class [15]
Test-6-score: [12.678599], class [118]
Test-7-score: [12.353498], class [505]
Test-8-score: [20.828007], class [747]
Test-9-score: [15.135801], class [315]
Test-10-score: [14.585114], class [920]
Test-11-score: [13.739927], class [679]
Test-12-score: [15.040644], class [386]
...
```
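Since the reported scores are pre-softmax values, they can be turned into class probabilities with a small amount of numpy; this is an illustrative post-processing step, not part of ```infer.py```.
```
import numpy as np

def to_probabilities(logits):
    # subtract the max for numerical stability before exponentiating
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

# e.g. for Test-0 above, to_probabilities(scores)[491] gives the
# model's confidence that the image belongs to class 491
```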
## Supported models and performances
Unless otherwise noted, models are trained by starting with learning rate ```0.1``` and decaying it by ```0.1``` after each pre-defined number of epochs. Available top-1/top-5 validation accuracy on ImageNet 2012 is listed in the table below. Pretrained models can be downloaded by clicking the corresponding model names.
|model | top-1/top-5 accuracy
|- | -:
|[AlexNet](http://paddle-imagenet-models.bj.bcebos.com/alexnet_model.tar) | 57.21%/79.72%
|VGG11 | -
|VGG13 | -
|VGG16 | -
|VGG19 | -
|GoogleNet | -
|InceptionV4 | -
|MobileNet | -
|[ResNet50](http://paddle-imagenet-models.bj.bcebos.com/resnet_50_model.tar) | 76.63%/93.10%
|ResNet101 | -
|ResNet152 | -
|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.33%/93.96%
|SE_ResNeXt101_32x4d | -
|SE_ResNeXt152_32x4d | -
|DPN68 | -
|DPN92 | -
|DPN98 | -
|DPN107 | -
|DPN131 | -
# Image Classification and Model Zoo
Image classification is an important field of computer vision. Its goal is to classify an image into one of a set of pre-defined labels. Recently, researchers have proposed many different kinds of neural networks and greatly improved classification performance. This page introduces how to do image classification with PaddlePaddle, including [data preparation](#data-preparation), [training](#training-a-model), [finetuning](#finetuning), [evaluation](#evaluation) and [inference](#inference).
---
## Table of Contents
- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training a model](#training-a-model)
- [Finetuning](#finetuning)
- [Evaluation](#evaluation)
- [Inference](#inference)
- [Supported models and performances](#supported-models)
## Installation
Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle on your device is lower than this version, please follow the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) and make an update.
## Data preparation
An example for the ImageNet classification task is given below. First of all, prepare the data as follows:
```
cd data/ILSVRC2012/
sh download_imagenet2012.sh
```
In the shell script ```download_imagenet2012.sh```, data is prepared in three steps:
**step-1:** Register at ```image-net.org``` first in order to get a pair of ```Username``` and ```AccessKey```, which are used to download the ImageNet data.
**step-2:** Download the ImageNet-2012 image data from the official website. The training and validation data will be downloaded into the "train" and "val" folders respectively. Please note that the ImageNet data is more than 40 GB in size and downloading is very time-consuming; users who have already downloaded ImageNet can directly organize the data under ```data/ILSVRC2012```.
**step-3:** Download the label files for the training and validation sets. The following two files contain the labels of the images in the training and validation sets respectively:
* *train_list.txt*: label file of the ImageNet-2012 training set, with the image path and label on each line separated by ```SPACE```, like:
```
train/n02483708/n02483708_2436.jpeg 369
train/n03998194/n03998194_7015.jpeg 741
train/n04523525/n04523525_38118.jpeg 884
train/n04596742/n04596742_3032.jpeg 909
train/n03208938/n03208938_7065.jpeg 535
...
```
* *val_list.txt*: label file of the ImageNet-2012 validation set, with the image path and label on each line separated by ```SPACE```, like:
```
val/ILSVRC2012_val_00000001.jpeg 65
val/ILSVRC2012_val_00000002.jpeg 970
val/ILSVRC2012_val_00000003.jpeg 230
val/ILSVRC2012_val_00000004.jpeg 809
val/ILSVRC2012_val_00000005.jpeg 516
...
```
## Training a model
After data preparation, start training as follows:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=False \
--lr_strategy=piecewise_decay \
--lr=0.1
```
**parameter introduction:**
* **model**: name of the model to use. Default: "SE_ResNeXt50_32x4d".
* **num_epochs**: the number of epochs. Default: 120.
* **batch_size**: the size of each mini-batch. Default: 256.
* **use_gpu**: whether to use GPU or not. Default: True.
* **total_images**: total number of images in the training set. Default: 1281167.
* **class_dim**: the class number of the classification task. Default: 1000.
* **image_shape**: input size of the network. Default: "3,224,224".
* **model_save_dir**: the directory to save trained model. Default: "output".
* **with_mem_opt**: whether to use memory optimization or not. Default: False.
* **lr_strategy**: learning rate changing strategy. Default: "piecewise_decay".
* **lr**: initialized learning rate. Default: 0.1.
* **pretrained_model**: model path for pretraining. Default: None.
* **checkpoint**: the checkpoint path to resume. Default: None.
**data reader introduction:** Data reader is defined in ```reader.py```. In the [training stage](#training-a-model), random crop and horizontal flip are used by default, while center crop is used by default in the [evaluation](#evaluation) and [inference](#inference) stages. Supported data augmentation includes:
* rotation
* color jitter
* random crop
* center crop
* resize
* horizontal flip
**training curve:** The training curve can be drawn from the training log. For example, the log from training AlexNet looks like:
```
End pass 1, train_loss 6.23153877258, train_acc1 0.0150696625933, train_acc5 0.0552518665791, test_loss 5.41981744766, test_acc1 0.0519132651389, test_acc5 0.156150355935
End pass 2, train_loss 5.15442800522, train_acc1 0.0784279331565, train_acc5 0.211050540209, test_loss 4.45795249939, test_acc1 0.140469551086, test_acc5 0.333163291216
End pass 3, train_loss 4.51505613327, train_acc1 0.145300447941, train_acc5 0.331567406654, test_loss 3.86548018456, test_acc1 0.219443559647, test_acc5 0.446448504925
End pass 4, train_loss 4.12735557556, train_acc1 0.19437250495, train_acc5 0.405713528395, test_loss 3.56990146637, test_acc1 0.264536827803, test_acc5 0.507190704346
End pass 5, train_loss 3.87505435944, train_acc1 0.229518383741, train_acc5 0.453582793474, test_loss 3.35345435143, test_acc1 0.297349333763, test_acc5 0.54753267765
End pass 6, train_loss 3.6929500103, train_acc1 0.255628824234, train_acc5 0.487188398838, test_loss 3.17112898827, test_acc1 0.326953113079, test_acc5 0.581780135632
End pass 7, train_loss 3.55882954597, train_acc1 0.275381118059, train_acc5 0.511990904808, test_loss 3.03736782074, test_acc1 0.349035382271, test_acc5 0.606293857098
End pass 8, train_loss 3.45595097542, train_acc1 0.291462600231, train_acc5 0.530815005302, test_loss 2.96034455299, test_acc1 0.362228929996, test_acc5 0.617390751839
End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545210540295, test_loss 2.93932366371, test_acc1 0.37129303813, test_acc5 0.623573005199
...
```
The figure below shows the error rate curves of AlexNet, ResNet50 and SE-ResNeXt-50:
<p align="center">
<img src="images/curve.jpg" height=480 width=640 hspace='10'/> <br />
Training and validation error rate curves
</p>
## Finetuning
Finetuning means training a model on a specific task after initializing it with pretrained weights. After setting ```path_to_pretrain_model```, one can finetune a model with the following command:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--pretrained_model=${path_to_pretrain_model} \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=True \
--lr_strategy=piecewise_decay \
--lr=0.1
```
## Evaluation
Evaluation measures the performance of a trained model. Users can download a [pretrained model](#supported-models) and set ```path_to_pretrain_model``` to its path. Running the following command yields the model's top-1/top-5 accuracy:
```
python eval.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
According to the configuration of the evaluation, the output log looks like:
```
Testbatch 0,loss 2.1786134243, acc1 0.625,acc5 0.8125,time 0.48 sec
Testbatch 10,loss 0.898496925831, acc1 0.75,acc5 0.9375,time 0.51 sec
Testbatch 20,loss 1.32524681091, acc1 0.6875,acc5 0.9375,time 0.37 sec
Testbatch 30,loss 1.46830511093, acc1 0.5,acc5 0.9375,time 0.51 sec
Testbatch 40,loss 1.12802267075, acc1 0.625,acc5 0.9375,time 0.35 sec
Testbatch 50,loss 0.881597697735, acc1 0.8125,acc5 1.0,time 0.32 sec
Testbatch 60,loss 0.300163716078, acc1 0.875,acc5 1.0,time 0.48 sec
Testbatch 70,loss 0.692037761211, acc1 0.875,acc5 1.0,time 0.35 sec
Testbatch 80,loss 0.0969972759485, acc1 1.0,acc5 1.0,time 0.41 sec
...
```
## Inference
Inference is used to get a model's prediction scores or image features:
```
python infer.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
The output contains the prediction results, including the maximum score (before softmax) and the corresponding predicted label.
```
Test-0-score: [13.168352], class [491]
Test-1-score: [7.913302], class [975]
Test-2-score: [16.959702], class [21]
Test-3-score: [14.197695], class [383]
Test-4-score: [12.607652], class [878]
Test-5-score: [17.725458], class [15]
Test-6-score: [12.678599], class [118]
Test-7-score: [12.353498], class [505]
Test-8-score: [20.828007], class [747]
Test-9-score: [15.135801], class [315]
Test-10-score: [14.585114], class [920]
Test-11-score: [13.739927], class [679]
Test-12-score: [15.040644], class [386]
...
```
## Supported models and performances
The table below lists the networks supported under the "models" directory, along with the top-1/top-5 accuracy of the trained models on the ImageNet-2012 validation set. Unless otherwise noted, models are trained with an initial learning rate of ```0.1```, decayed by ```0.1``` after each pre-defined number of epochs. Pretrained models can be downloaded by clicking the corresponding model names.
|model | top-1/top-5 accuracy
|- | -:
|[AlexNet](http://paddle-imagenet-models.bj.bcebos.com/alexnet_model.tar) | 57.21%/79.72%
|VGG11 | -
|VGG13 | -
|VGG16 | -
|VGG19 | -
|GoogleNet | -
|InceptionV4 | -
|MobileNet | -
|[ResNet50](http://paddle-imagenet-models.bj.bcebos.com/resnet_50_model.tar) | 76.63%/93.10%
|ResNet101 | -
|ResNet152 | -
|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.33%/93.96%
|SE_ResNeXt101_32x4d | -
|SE_ResNeXt152_32x4d | -
|DPN68 | -
|DPN92 | -
|DPN98 | -
|DPN107 | -
|DPN131 | -
set -e
if [ "x${IMAGENET_USERNAME}" == x -o "x${IMAGENET_ACCESS_KEY}" == x ];then
echo "Please create an account on image-net.org."
echo "It will provide you a pair of username and accesskey to download imagenet data."
read -p "Username: " IMAGENET_USERNAME
read -p "Accesskey: " IMAGENET_ACCESS_KEY
fi
root_url=http://www.image-net.org/challenges/LSVRC/2012/nnoupb
valid_tar=ILSVRC2012_img_val.tar
train_tar=ILSVRC2012_img_train.tar
train_folder=train/
valid_folder=val/
echo "Download imagenet training data..."
mkdir -p ${train_folder}
wget -nd -c ${root_url}/${train_tar}
tar xf ${train_tar} -C ${train_folder}
cd ${train_folder}
for x in `ls *.tar`
do
filename=`basename $x .tar`
mkdir -p $filename
tar -xf $x -C $filename
rm -rf $x
done
cd -
echo "Download imagenet validation data..."
mkdir -p ${valid_folder}
wget -nd -c ${root_url}/${valid_tar}
tar xf ${valid_tar} -C ${valid_folder}
echo "Download imagenet label file: val_list.txt & train_list.txt"
label_file=ImageNet_label.tgz
label_url=http://imagenet-data.bj.bcebos.com/${label_file}
wget -nd -c ${label_url}
tar zxf ${label_file}
cd train
dir=./
for x in `ls *.tar`
do
filename=`basename $x .tar`
mkdir $filename
tar -xvf $x -C ./$filename
done
import os
import numpy as np
import time
import sys
import paddle
import paddle.fluid as fluid
import models
import reader
import argparse
import functools
from models.learning_rate import cosine_decay
from utility import add_arguments, print_arguments
import math

parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('batch_size',       int,  256,                  "Minibatch size.")
add_arg('use_gpu',          bool, True,                 "Whether to use GPU or not.")
add_arg('class_dim',        int,  1000,                 "Class number.")
add_arg('image_shape',      str,  "3,224,224",          "Input image size")
add_arg('with_mem_opt',     bool, True,                 "Whether to use memory optimization or not.")
add_arg('pretrained_model', str,  None,                 "Whether to use pretrained model.")
add_arg('model',            str,  "SE_ResNeXt50_32x4d", "Set the network to use.")
# yapf: enable

model_list = [m for m in dir(models) if "__" not in m]


def eval(args):
    # parameters from arguments
    class_dim = args.class_dim
    model_name = args.model
    pretrained_model = args.pretrained_model
    with_memory_optimization = args.with_mem_opt
    image_shape = [int(m) for m in args.image_shape.split(",")]

    assert model_name in model_list, "{} is not in lists: {}".format(args.model,
                                                                     model_list)

    image = fluid.layers.data(name='image', shape=image_shape, dtype='float32')
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')

    # model definition
    model = models.__dict__[model_name]()

    if model_name == "GoogleNet":
        # GoogleNet returns two auxiliary classifiers besides the main output;
        # their losses are down-weighted by 0.3 and accuracy is measured on
        # the main output only
        out0, out1, out2 = model.net(input=image, class_dim=class_dim)
        cost0 = fluid.layers.cross_entropy(input=out0, label=label)
        cost1 = fluid.layers.cross_entropy(input=out1, label=label)
        cost2 = fluid.layers.cross_entropy(input=out2, label=label)
        avg_cost0 = fluid.layers.mean(x=cost0)
        avg_cost1 = fluid.layers.mean(x=cost1)
        avg_cost2 = fluid.layers.mean(x=cost2)

        avg_cost = avg_cost0 + 0.3 * avg_cost1 + 0.3 * avg_cost2
        acc_top1 = fluid.layers.accuracy(input=out0, label=label, k=1)
        acc_top5 = fluid.layers.accuracy(input=out0, label=label, k=5)
    else:
        out = model.net(input=image, class_dim=class_dim)
        cost = fluid.layers.cross_entropy(input=out, label=label)

        avg_cost = fluid.layers.mean(x=cost)
        acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1)
        acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5)

    test_program = fluid.default_main_program().clone(for_test=True)

    if with_memory_optimization:
        fluid.memory_optimize(fluid.default_main_program())

    place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    if pretrained_model:

        def if_exist(var):
            return os.path.exists(os.path.join(pretrained_model, var.name))

        fluid.io.load_vars(exe, pretrained_model, predicate=if_exist)

    val_reader = paddle.batch(reader.val(), batch_size=args.batch_size)
    feeder = fluid.DataFeeder(place=place, feed_list=[image, label])

    fetch_list = [avg_cost.name, acc_top1.name, acc_top5.name]

    test_info = [[], [], []]
    cnt = 0
    for batch_id, data in enumerate(val_reader()):
        t1 = time.time()
        loss, acc1, acc5 = exe.run(test_program,
                                   fetch_list=fetch_list,
                                   feed=feeder.feed(data))
        t2 = time.time()
        period = t2 - t1
        loss = np.mean(loss)
        acc1 = np.mean(acc1)
        acc5 = np.mean(acc5)
        # accumulate sample-weighted sums so the final metrics are averaged
        # over images, not over batches
        test_info[0].append(loss * len(data))
        test_info[1].append(acc1 * len(data))
        test_info[2].append(acc5 * len(data))
        cnt += len(data)
        if batch_id % 10 == 0:
            print("Testbatch {0},loss {1}, "
                  "acc1 {2},acc5 {3},time {4}".format(batch_id, \
                  loss, acc1, acc5, \
                  "%2.2f sec" % period))
            sys.stdout.flush()

    test_loss = np.sum(test_info[0]) / cnt
    test_acc1 = np.sum(test_info[1]) / cnt
    test_acc5 = np.sum(test_info[2]) / cnt

    print("Test_loss {0}, test_acc1 {1}, test_acc5 {2}".format(
        test_loss, test_acc1, test_acc5))
    sys.stdout.flush()


def main():
    args = parser.parse_args()
    print_arguments(args)
    eval(args)


if __name__ == '__main__':
    main()
import os
import paddle.fluid as fluid
def inception_v4(img, class_dim):
tmp = stem(input=img)
    # four Inception-A blocks, per the Inception-v4 architecture
    for i in range(4):
        tmp = inception_A(input=tmp, depth=i)
tmp = reduction_A(input=tmp)
for i in range(7):
tmp = inception_B(input=tmp, depth=i)
    tmp = reduction_B(input=tmp)
for i in range(3):
tmp = inception_C(input=tmp, depth=i)
pool = fluid.layers.pool2d(
pool_type='avg', input=tmp, pool_size=7, pool_stride=1)
dropout = fluid.layers.dropout(x=pool, dropout_prob=0.2)
    # act='softmax' already normalizes the fc output; do not apply softmax twice
    out = fluid.layers.fc(input=dropout, size=class_dim, act='softmax')
    return out
def conv_bn_layer(name,
input,
num_filters,
filter_size,
padding=0,
stride=1,
groups=1,
act=None):
conv = fluid.layers.conv2d(
name=name,
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=padding,
groups=groups,
act=None,
bias_attr=False)
return fluid.layers.batch_norm(name=name + '_norm', input=conv, act=act)
def stem(input):
conv0 = conv_bn_layer(
name='stem_conv_0',
input=input,
num_filters=32,
filter_size=3,
padding=1,
stride=2)
conv1 = conv_bn_layer(
name='stem_conv_1',
input=conv0,
num_filters=32,
filter_size=3,
padding=1)
conv2 = conv_bn_layer(
name='stem_conv_2',
input=conv1,
num_filters=64,
filter_size=3,
padding=1)
def block0(input):
pool0 = fluid.layers.pool2d(
input=input,
pool_size=3,
pool_stride=2,
pool_type='max',
pool_padding=1)
conv0 = conv_bn_layer(
name='stem_block0_conv',
input=input,
num_filters=96,
filter_size=3,
stride=2,
padding=1)
return fluid.layers.concat(input=[pool0, conv0], axis=1)
def block1(input):
l_conv0 = conv_bn_layer(
name='stem_block1_l_conv0',
input=input,
num_filters=64,
filter_size=1,
stride=1,
padding=0)
l_conv1 = conv_bn_layer(
name='stem_block1_l_conv1',
input=l_conv0,
num_filters=96,
filter_size=3,
stride=1,
padding=1)
r_conv0 = conv_bn_layer(
name='stem_block1_r_conv0',
input=input,
num_filters=64,
filter_size=1,
stride=1,
padding=0)
r_conv1 = conv_bn_layer(
name='stem_block1_r_conv1',
input=r_conv0,
num_filters=64,
filter_size=(7, 1),
stride=1,
padding=(3, 0))
r_conv2 = conv_bn_layer(
name='stem_block1_r_conv2',
input=r_conv1,
num_filters=64,
filter_size=(1, 7),
stride=1,
padding=(0, 3))
r_conv3 = conv_bn_layer(
name='stem_block1_r_conv3',
input=r_conv2,
num_filters=96,
filter_size=3,
stride=1,
padding=1)
return fluid.layers.concat(input=[l_conv1, r_conv3], axis=1)
def block2(input):
conv0 = conv_bn_layer(
name='stem_block2_conv',
input=input,
num_filters=192,
filter_size=3,
stride=2,
padding=1)
pool0 = fluid.layers.pool2d(
input=input,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
return fluid.layers.concat(input=[conv0, pool0], axis=1)
conv3 = block0(conv2)
conv4 = block1(conv3)
conv5 = block2(conv4)
return conv5
def inception_A(input, depth):
b0_pool0 = fluid.layers.pool2d(
name='inceptA{0}_branch0_pool0'.format(depth),
input=input,
pool_size=3,
pool_stride=1,
pool_padding=1,
pool_type='avg')
b0_conv0 = conv_bn_layer(
name='inceptA{0}_branch0_conv0'.format(depth),
input=b0_pool0,
num_filters=96,
filter_size=1,
stride=1,
padding=0)
b1_conv0 = conv_bn_layer(
name='inceptA{0}_branch1_conv0'.format(depth),
input=input,
num_filters=96,
filter_size=1,
stride=1,
padding=0)
b2_conv0 = conv_bn_layer(
name='inceptA{0}_branch2_conv0'.format(depth),
input=input,
num_filters=64,
filter_size=1,
stride=1,
padding=0)
b2_conv1 = conv_bn_layer(
name='inceptA{0}_branch2_conv1'.format(depth),
input=b2_conv0,
num_filters=96,
filter_size=3,
stride=1,
padding=1)
b3_conv0 = conv_bn_layer(
name='inceptA{0}_branch3_conv0'.format(depth),
input=input,
num_filters=64,
filter_size=1,
stride=1,
padding=0)
b3_conv1 = conv_bn_layer(
name='inceptA{0}_branch3_conv1'.format(depth),
input=b3_conv0,
num_filters=96,
filter_size=3,
stride=1,
padding=1)
b3_conv2 = conv_bn_layer(
name='inceptA{0}_branch3_conv2'.format(depth),
input=b3_conv1,
num_filters=96,
filter_size=3,
stride=1,
padding=1)
return fluid.layers.concat(
input=[b0_conv0, b1_conv0, b2_conv1, b3_conv2], axis=1)
def reduction_A(input):
b0_pool0 = fluid.layers.pool2d(
name='ReductA_branch0_pool0',
input=input,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
b1_conv0 = conv_bn_layer(
name='ReductA_branch1_conv0',
input=input,
num_filters=384,
filter_size=3,
stride=2,
padding=1)
b2_conv0 = conv_bn_layer(
name='ReductA_branch2_conv0',
input=input,
num_filters=192,
filter_size=1,
stride=1,
padding=0)
b2_conv1 = conv_bn_layer(
name='ReductA_branch2_conv1',
input=b2_conv0,
num_filters=224,
filter_size=3,
stride=1,
padding=1)
b2_conv2 = conv_bn_layer(
name='ReductA_branch2_conv2',
input=b2_conv1,
num_filters=256,
filter_size=3,
stride=2,
padding=1)
return fluid.layers.concat(input=[b0_pool0, b1_conv0, b2_conv2], axis=1)
def inception_B(input, depth):
b0_pool0 = fluid.layers.pool2d(
name='inceptB{0}_branch0_pool0'.format(depth),
input=input,
pool_size=3,
pool_stride=1,
pool_padding=1,
pool_type='avg')
b0_conv0 = conv_bn_layer(
name='inceptB{0}_branch0_conv0'.format(depth),
input=b0_pool0,
num_filters=128,
filter_size=1,
stride=1,
padding=0)
b1_conv0 = conv_bn_layer(
name='inceptB{0}_branch1_conv0'.format(depth),
input=input,
num_filters=384,
filter_size=1,
stride=1,
padding=0)
b2_conv0 = conv_bn_layer(
name='inceptB{0}_branch2_conv0'.format(depth),
input=input,
num_filters=192,
filter_size=1,
stride=1,
padding=0)
b2_conv1 = conv_bn_layer(
name='inceptB{0}_branch2_conv1'.format(depth),
input=b2_conv0,
num_filters=224,
filter_size=(1, 7),
stride=1,
padding=(0, 3))
b2_conv2 = conv_bn_layer(
name='inceptB{0}_branch2_conv2'.format(depth),
input=b2_conv1,
num_filters=256,
filter_size=(7, 1),
stride=1,
padding=(3, 0))
b3_conv0 = conv_bn_layer(
name='inceptB{0}_branch3_conv0'.format(depth),
input=input,
num_filters=192,
filter_size=1,
stride=1,
padding=0)
b3_conv1 = conv_bn_layer(
name='inceptB{0}_branch3_conv1'.format(depth),
input=b3_conv0,
num_filters=192,
filter_size=(1, 7),
stride=1,
padding=(0, 3))
b3_conv2 = conv_bn_layer(
name='inceptB{0}_branch3_conv2'.format(depth),
input=b3_conv1,
num_filters=224,
filter_size=(7, 1),
stride=1,
padding=(3, 0))
b3_conv3 = conv_bn_layer(
name='inceptB{0}_branch3_conv3'.format(depth),
input=b3_conv2,
num_filters=224,
filter_size=(1, 7),
stride=1,
padding=(0, 3))
b3_conv4 = conv_bn_layer(
name='inceptB{0}_branch3_conv4'.format(depth),
input=b3_conv3,
num_filters=256,
filter_size=(7, 1),
stride=1,
padding=(3, 0))
return fluid.layers.concat(
input=[b0_conv0, b1_conv0, b2_conv2, b3_conv4], axis=1)
def reduction_B(input):
b0_pool0 = fluid.layers.pool2d(
name='ReductB_branch0_pool0',
input=input,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
b1_conv0 = conv_bn_layer(
name='ReductB_branch1_conv0',
input=input,
num_filters=192,
filter_size=1,
stride=1,
padding=0)
b1_conv1 = conv_bn_layer(
name='ReductB_branch1_conv1',
input=b1_conv0,
num_filters=192,
filter_size=3,
stride=2,
padding=1)
b2_conv0 = conv_bn_layer(
name='ReductB_branch2_conv0',
input=input,
num_filters=256,
filter_size=1,
stride=1,
padding=0)
b2_conv1 = conv_bn_layer(
name='ReductB_branch2_conv1',
input=b2_conv0,
num_filters=256,
filter_size=(1, 7),
stride=1,
padding=(0, 3))
b2_conv2 = conv_bn_layer(
name='ReductB_branch2_conv2',
input=b2_conv1,
num_filters=320,
filter_size=(7, 1),
stride=1,
padding=(3, 0))
b2_conv3 = conv_bn_layer(
name='ReductB_branch2_conv3',
input=b2_conv2,
num_filters=320,
filter_size=3,
stride=2,
padding=1)
return fluid.layers.concat(input=[b0_pool0, b1_conv1, b2_conv3], axis=1)
def inception_C(input, depth):
b0_pool0 = fluid.layers.pool2d(
name='inceptC{0}_branch0_pool0'.format(depth),
input=input,
pool_size=3,
pool_stride=1,
pool_padding=1,
pool_type='avg')
b0_conv0 = conv_bn_layer(
name='inceptC{0}_branch0_conv0'.format(depth),
input=b0_pool0,
num_filters=256,
filter_size=1,
stride=1,
padding=0)
b1_conv0 = conv_bn_layer(
name='inceptC{0}_branch1_conv0'.format(depth),
input=input,
num_filters=256,
filter_size=1,
stride=1,
padding=0)
b2_conv0 = conv_bn_layer(
name='inceptC{0}_branch2_conv0'.format(depth),
input=input,
num_filters=384,
filter_size=1,
stride=1,
padding=0)
b2_conv1 = conv_bn_layer(
name='inceptC{0}_branch2_conv1'.format(depth),
input=b2_conv0,
num_filters=256,
filter_size=(1, 3),
stride=1,
padding=(0, 1))
b2_conv2 = conv_bn_layer(
name='inceptC{0}_branch2_conv2'.format(depth),
input=b2_conv0,
num_filters=256,
filter_size=(3, 1),
stride=1,
padding=(1, 0))
b3_conv0 = conv_bn_layer(
name='inceptC{0}_branch3_conv0'.format(depth),
input=input,
num_filters=384,
filter_size=1,
stride=1,
padding=0)
b3_conv1 = conv_bn_layer(
name='inceptC{0}_branch3_conv1'.format(depth),
input=b3_conv0,
num_filters=448,
filter_size=(1, 3),
stride=1,
padding=(0, 1))
b3_conv2 = conv_bn_layer(
name='inceptC{0}_branch3_conv2'.format(depth),
input=b3_conv1,
num_filters=512,
filter_size=(3, 1),
stride=1,
padding=(1, 0))
b3_conv3 = conv_bn_layer(
name='inceptC{0}_branch3_conv3'.format(depth),
input=b3_conv2,
num_filters=256,
filter_size=(3, 1),
stride=1,
padding=(1, 0))
b3_conv4 = conv_bn_layer(
name='inceptC{0}_branch3_conv4'.format(depth),
input=b3_conv2,
num_filters=256,
filter_size=(1, 3),
stride=1,
padding=(0, 1))
return fluid.layers.concat(
input=[b0_conv0, b1_conv0, b2_conv1, b2_conv2, b3_conv3, b3_conv4],
axis=1)
import os
import numpy as np
import time
import sys
import paddle
import paddle.fluid as fluid
import models
import reader
import argparse
import functools
from models.learning_rate import cosine_decay
from utility import add_arguments, print_arguments
import math

parser = argparse.ArgumentParser(description=__doc__)
# yapf: disable
add_arg = functools.partial(add_arguments, argparser=parser)
add_arg('batch_size',       int,  256,                  "Minibatch size.")
add_arg('use_gpu',          bool, True,                 "Whether to use GPU or not.")
add_arg('class_dim',        int,  1000,                 "Class number.")
add_arg('image_shape',      str,  "3,224,224",          "Input image size")
add_arg('with_mem_opt',     bool, True,                 "Whether to use memory optimization or not.")
add_arg('pretrained_model', str,  None,                 "Whether to use pretrained model.")
add_arg('model',            str,  "SE_ResNeXt50_32x4d", "Set the network to use.")
# yapf: enable

model_list = [m for m in dir(models) if "__" not in m]


def infer(args):
    # parameters from arguments
    class_dim = args.class_dim
    model_name = args.model
    pretrained_model = args.pretrained_model
    with_memory_optimization = args.with_mem_opt
    image_shape = [int(m) for m in args.image_shape.split(",")]

    assert model_name in model_list, "{} is not in lists: {}".format(args.model,
                                                                     model_list)

    image = fluid.layers.data(name='image', shape=image_shape, dtype='float32')

    # model definition
    model = models.__dict__[model_name]()

    if model_name == "GoogleNet":
        # only the main classifier output is used at inference time
        out, _, _ = model.net(input=image, class_dim=class_dim)
    else:
        out = model.net(input=image, class_dim=class_dim)

    test_program = fluid.default_main_program().clone(for_test=True)

    if with_memory_optimization:
        fluid.memory_optimize(fluid.default_main_program())

    place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    if pretrained_model:

        def if_exist(var):
            return os.path.exists(os.path.join(pretrained_model, var.name))

        fluid.io.load_vars(exe, pretrained_model, predicate=if_exist)

    test_batch_size = 1
    test_reader = paddle.batch(reader.test(), batch_size=test_batch_size)
    feeder = fluid.DataFeeder(place=place, feed_list=[image])

    fetch_list = [out.name]

    TOPK = 1
    for batch_id, data in enumerate(test_reader()):
        result = exe.run(test_program,
                         fetch_list=fetch_list,
                         feed=feeder.feed(data))
        result = result[0][0]
        # indices of the TOPK highest scores, in descending order
        pred_label = np.argsort(result)[::-1][:TOPK]
        print("Test-{0}-score: {1}, class {2}"
              .format(batch_id, result[pred_label], pred_label))
        sys.stdout.flush()


def main():
    args = parser.parse_args()
    print_arguments(args)
    infer(args)


if __name__ == '__main__':
    main()
import os
import paddle.v2 as paddle
import paddle.fluid as fluid
from paddle.fluid.initializer import MSRA
from paddle.fluid.param_attr import ParamAttr
parameter_attr = ParamAttr(initializer=MSRA())
def conv_bn_layer(input,
filter_size,
num_filters,
stride,
padding,
channels=None,
num_groups=1,
act='relu',
use_cudnn=True):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=padding,
groups=num_groups,
act=None,
use_cudnn=use_cudnn,
param_attr=parameter_attr,
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def depthwise_separable(input, num_filters1, num_filters2, num_groups, stride,
scale):
"""
"""
depthwise_conv = conv_bn_layer(
input=input,
filter_size=3,
num_filters=int(num_filters1 * scale),
stride=stride,
padding=1,
num_groups=int(num_groups * scale),
use_cudnn=False)
pointwise_conv = conv_bn_layer(
input=depthwise_conv,
filter_size=1,
num_filters=int(num_filters2 * scale),
stride=1,
padding=0)
return pointwise_conv
def mobile_net(img, class_dim, scale=1.0):
# conv1: 112x112
tmp = conv_bn_layer(
img,
filter_size=3,
channels=3,
num_filters=int(32 * scale),
stride=2,
padding=1)
# 56x56
tmp = depthwise_separable(
tmp,
num_filters1=32,
num_filters2=64,
num_groups=32,
stride=1,
scale=scale)
tmp = depthwise_separable(
tmp,
num_filters1=64,
num_filters2=128,
num_groups=64,
stride=2,
scale=scale)
# 28x28
tmp = depthwise_separable(
tmp,
num_filters1=128,
num_filters2=128,
num_groups=128,
stride=1,
scale=scale)
tmp = depthwise_separable(
tmp,
num_filters1=128,
num_filters2=256,
num_groups=128,
stride=2,
scale=scale)
# 14x14
tmp = depthwise_separable(
tmp,
num_filters1=256,
num_filters2=256,
num_groups=256,
stride=1,
scale=scale)
tmp = depthwise_separable(
tmp,
num_filters1=256,
num_filters2=512,
num_groups=256,
stride=2,
scale=scale)
# 14x14
for i in range(5):
tmp = depthwise_separable(
tmp,
num_filters1=512,
num_filters2=512,
num_groups=512,
stride=1,
scale=scale)
# 7x7
tmp = depthwise_separable(
tmp,
num_filters1=512,
num_filters2=1024,
num_groups=512,
stride=2,
scale=scale)
tmp = depthwise_separable(
tmp,
num_filters1=1024,
num_filters2=1024,
num_groups=1024,
stride=1,
scale=scale)
tmp = fluid.layers.pool2d(
input=tmp,
pool_size=0,
pool_stride=1,
pool_type='avg',
global_pooling=True)
tmp = fluid.layers.fc(input=tmp,
size=class_dim,
act='softmax',
param_attr=parameter_attr)
return tmp
from .alexnet import AlexNet
from .mobilenet import MobileNet
from .googlenet import GoogleNet
from .vgg import VGG11, VGG13, VGG16, VGG19
from .resnet import ResNet50, ResNet101, ResNet152
from .inception_v4 import InceptionV4
from .se_resnext import SE_ResNeXt50_32x4d, SE_ResNeXt101_32x4d, SE_ResNeXt152_32x4d
from .dpn import DPN68, DPN92, DPN98, DPN107, DPN131
import paddle
import paddle.fluid as fluid
import math
__all__ = ['AlexNet']
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [40, 70, 100],
"steps": [0.01, 0.001, 0.0001, 0.00001]
}
}
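# NOTE: these defaults (input size, normalization constants and the piecewise
# decay schedule) are exposed through the model's .params attribute; they are
# presumably read by the training script rather than used within this module.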
class AlexNet():
def __init__(self):
self.params = train_parameters
def net(self, input, class_dim=1000):
stdv = 1.0 / math.sqrt(input.shape[1] * 11 * 11)
conv1 = fluid.layers.conv2d(
input=input,
num_filters=64,
filter_size=11,
stride=4,
padding=2,
groups=1,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
pool1 = fluid.layers.pool2d(
input=conv1,
pool_size=3,
pool_stride=2,
pool_padding=0,
pool_type='max')
stdv = 1.0 / math.sqrt(pool1.shape[1] * 5 * 5)
conv2 = fluid.layers.conv2d(
input=pool1,
num_filters=192,
filter_size=5,
stride=1,
padding=2,
groups=1,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
pool2 = fluid.layers.pool2d(
input=conv2,
pool_size=3,
pool_stride=2,
pool_padding=0,
pool_type='max')
stdv = 1.0 / math.sqrt(pool2.shape[1] * 3 * 3)
conv3 = fluid.layers.conv2d(
input=pool2,
num_filters=384,
filter_size=3,
stride=1,
padding=1,
groups=1,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
stdv = 1.0 / math.sqrt(conv3.shape[1] * 3 * 3)
conv4 = fluid.layers.conv2d(
input=conv3,
num_filters=256,
filter_size=3,
stride=1,
padding=1,
groups=1,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
stdv = 1.0 / math.sqrt(conv4.shape[1] * 3 * 3)
conv5 = fluid.layers.conv2d(
input=conv4,
num_filters=256,
filter_size=3,
stride=1,
padding=1,
groups=1,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
pool5 = fluid.layers.pool2d(
input=conv5,
pool_size=3,
pool_stride=2,
pool_padding=0,
pool_type='max')
drop6 = fluid.layers.dropout(x=pool5, dropout_prob=0.5)
stdv = 1.0 / math.sqrt(drop6.shape[1] * drop6.shape[2] *
drop6.shape[3] * 1.0)
fc6 = fluid.layers.fc(
input=drop6,
size=4096,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
drop7 = fluid.layers.dropout(x=fc6, dropout_prob=0.5)
stdv = 1.0 / math.sqrt(drop7.shape[1] * 1.0)
fc7 = fluid.layers.fc(
input=drop7,
size=4096,
act='relu',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
stdv = 1.0 / math.sqrt(fc7.shape[1] * 1.0)
out = fluid.layers.fc(
input=fc7,
size=class_dim,
act='softmax',
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)),
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
return out
import os
import numpy as np
import time
import sys
import paddle
import paddle.fluid as fluid
import paddle.fluid.layers.control_flow as control_flow
import paddle.fluid.layers.nn as nn
import paddle.fluid.layers.tensor as tensor
import math
__all__ = ["DPN", "DPN68", "DPN92", "DPN98", "DPN107", "DPN131"]
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class DPN(object):
def __init__(self, layers=68):
self.params = train_parameters
self.layers = layers
def net(self, input, class_dim=1000):
# get network args
args = self.get_net_args(self.layers)
bws = args['bw']
inc_sec = args['inc_sec']
        rs = args['r']
k_r = args['k_r']
k_sec = args['k_sec']
G = args['G']
init_num_filter = args['init_num_filter']
init_filter_size = args['init_filter_size']
init_padding = args['init_padding']
## define Dual Path Network
# conv1
conv1_x_1 = fluid.layers.conv2d(
input=input,
num_filters=init_num_filter,
filter_size=init_filter_size,
stride=2,
padding=init_padding,
groups=1,
act=None,
bias_attr=False)
conv1_x_1 = fluid.layers.batch_norm(
input=conv1_x_1, act='relu', is_test=False)
convX_x_x = fluid.layers.pool2d(
input=conv1_x_1,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
#conv2 - conv5
for gc in range(4):
bw = bws[gc]
inc = inc_sec[gc]
R = (k_r * bw) / rs[gc]
if gc == 0:
_type1 = 'proj'
_type2 = 'normal'
else:
_type1 = 'down'
_type2 = 'normal'
convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G,
_type1)
for i_ly in range(2, k_sec[gc] + 1):
convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G,
_type2)
conv5_x_x = fluid.layers.concat(convX_x_x, axis=1)
conv5_x_x = fluid.layers.batch_norm(
input=conv5_x_x, act='relu', is_test=False)
pool5 = fluid.layers.pool2d(
input=conv5_x_x,
pool_size=7,
pool_stride=1,
pool_padding=0,
pool_type='avg')
#stdv = 1.0 / math.sqrt(pool5.shape[1] * 1.0)
stdv = 0.01
param_attr = fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv))
fc6 = fluid.layers.fc(input=pool5,
size=class_dim,
act='softmax',
param_attr=param_attr)
return fc6
def get_net_args(self, layers):
if layers == 68:
k_r = 128
G = 32
k_sec = [3, 4, 12, 3]
inc_sec = [16, 32, 32, 64]
bw = [64, 128, 256, 512]
r = [64, 64, 64, 64]
init_num_filter = 10
init_filter_size = 3
init_padding = 1
elif layers == 92:
k_r = 96
G = 32
k_sec = [3, 4, 20, 3]
inc_sec = [16, 32, 24, 128]
bw = [256, 512, 1024, 2048]
r = [256, 256, 256, 256]
init_num_filter = 64
init_filter_size = 7
init_padding = 3
elif layers == 98:
k_r = 160
G = 40
k_sec = [3, 6, 20, 3]
inc_sec = [16, 32, 32, 128]
bw = [256, 512, 1024, 2048]
r = [256, 256, 256, 256]
init_num_filter = 96
init_filter_size = 7
init_padding = 3
elif layers == 107:
k_r = 200
G = 50
k_sec = [4, 8, 20, 3]
inc_sec = [20, 64, 64, 128]
bw = [256, 512, 1024, 2048]
r = [256, 256, 256, 256]
init_num_filter = 128
init_filter_size = 7
init_padding = 3
elif layers == 131:
k_r = 160
G = 40
k_sec = [4, 8, 28, 3]
inc_sec = [16, 32, 32, 128]
bw = [256, 512, 1024, 2048]
r = [256, 256, 256, 256]
init_num_filter = 128
init_filter_size = 7
init_padding = 3
else:
raise NotImplementedError
net_arg = {
'k_r': k_r,
'G': G,
'k_sec': k_sec,
'inc_sec': inc_sec,
'bw': bw,
'r': r
}
net_arg['init_num_filter'] = init_num_filter
net_arg['init_filter_size'] = init_filter_size
net_arg['init_padding'] = init_padding
return net_arg
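    # NOTE: dual_path_factory below maintains two parallel paths per block: a
    # residual path combined by element-wise addition (summ) and a densely
    # connected path combined by channel concatenation (dense), which is the
    # defining structure of Dual Path Networks.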
def dual_path_factory(self,
data,
num_1x1_a,
num_3x3_b,
num_1x1_c,
inc,
G,
_type='normal'):
kw = 3
kh = 3
pw = (kw - 1) / 2
ph = (kh - 1) / 2
# type
        if _type == 'proj':
            key_stride = 1
            has_proj = True
        if _type == 'down':
            key_stride = 2
            has_proj = True
        if _type == 'normal':
            key_stride = 1
            has_proj = False
# PROJ
if type(data) is list:
data_in = fluid.layers.concat([data[0], data[1]], axis=1)
else:
data_in = data
if has_proj:
c1x1_w = self.bn_ac_conv(
data=data_in,
num_filter=(num_1x1_c + 2 * inc),
kernel=(1, 1),
pad=(0, 0),
stride=(key_stride, key_stride))
data_o1, data_o2 = fluid.layers.split(
c1x1_w, num_or_sections=[num_1x1_c, 2 * inc], dim=1)
else:
data_o1 = data[0]
data_o2 = data[1]
# MAIN
c1x1_a = self.bn_ac_conv(
data=data_in, num_filter=num_1x1_a, kernel=(1, 1), pad=(0, 0))
c3x3_b = self.bn_ac_conv(
data=c1x1_a,
num_filter=num_3x3_b,
kernel=(kw, kh),
pad=(pw, ph),
stride=(key_stride, key_stride),
num_group=G)
c1x1_c = self.bn_ac_conv(
data=c3x3_b,
num_filter=(num_1x1_c + inc),
kernel=(1, 1),
pad=(0, 0))
c1x1_c1, c1x1_c2 = fluid.layers.split(
c1x1_c, num_or_sections=[num_1x1_c, inc], dim=1)
# OUTPUTS
summ = fluid.layers.elementwise_add(x=data_o1, y=c1x1_c1)
dense = fluid.layers.concat([data_o2, c1x1_c2], axis=1)
return [summ, dense]
def bn_ac_conv(self,
data,
num_filter,
kernel,
pad,
stride=(1, 1),
num_group=1):
bn_ac = fluid.layers.batch_norm(input=data, act='relu', is_test=False)
bn_ac_conv = fluid.layers.conv2d(
input=bn_ac,
num_filters=num_filter,
filter_size=kernel,
stride=stride,
padding=pad,
groups=num_group,
act=None,
bias_attr=False)
return bn_ac_conv
def DPN68():
model = DPN(layers=68)
return model
def DPN92():
model = DPN(layers=92)
return model
def DPN98():
model = DPN(layers=98)
return model
def DPN107():
model = DPN(layers=107)
return model
def DPN131():
model = DPN(layers=131)
return model
import paddle
import paddle.fluid as fluid
__all__ = ['GoogleNet']
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class GoogleNet():
def __init__(self):
self.params = train_parameters
def conv_layer(self,
input,
num_filters,
filter_size,
stride=1,
groups=1,
act=None):
channels = input.shape[1]
stdv = (3.0 / (filter_size**2 * channels))**0.5
param_attr = fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv))
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=(filter_size - 1) / 2,
groups=groups,
act=act,
param_attr=param_attr,
bias_attr=False)
return conv
def xavier(self, channels, filter_size):
stdv = (3.0 / (filter_size**2 * channels))**0.5
param_attr = fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv))
return param_attr
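    # Each inception module runs four parallel branches (1x1 conv, 1x1->3x3,
    # 1x1->5x5, and 3x3 max-pool followed by a 1x1 projection) and
    # concatenates their outputs along the channel dimension.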
def inception(self, name, input, channels, filter1, filter3R, filter3,
filter5R, filter5, proj):
conv1 = self.conv_layer(
input=input, num_filters=filter1, filter_size=1, stride=1, act=None)
conv3r = self.conv_layer(
input=input,
num_filters=filter3R,
filter_size=1,
stride=1,
act=None)
conv3 = self.conv_layer(
input=conv3r,
num_filters=filter3,
filter_size=3,
stride=1,
act=None)
conv5r = self.conv_layer(
input=input,
num_filters=filter5R,
filter_size=1,
stride=1,
act=None)
conv5 = self.conv_layer(
input=conv5r,
num_filters=filter5,
filter_size=5,
stride=1,
act=None)
pool = fluid.layers.pool2d(
input=input,
pool_size=3,
pool_stride=1,
pool_padding=1,
pool_type='max')
convprj = fluid.layers.conv2d(
input=pool, filter_size=1, num_filters=proj, stride=1, padding=0)
cat = fluid.layers.concat(input=[conv1, conv3, conv5, convprj], axis=1)
cat = fluid.layers.relu(cat)
return cat
def net(self, input, class_dim=1000):
conv = self.conv_layer(
input=input, num_filters=64, filter_size=7, stride=2, act=None)
pool = fluid.layers.pool2d(
input=conv, pool_size=3, pool_type='max', pool_stride=2)
conv = self.conv_layer(
input=pool, num_filters=64, filter_size=1, stride=1, act=None)
conv = self.conv_layer(
input=conv, num_filters=192, filter_size=3, stride=1, act=None)
pool = fluid.layers.pool2d(
input=conv, pool_size=3, pool_type='max', pool_stride=2)
ince3a = self.inception("ince3a", pool, 192, 64, 96, 128, 16, 32, 32)
ince3b = self.inception("ince3b", ince3a, 256, 128, 128, 192, 32, 96,
64)
pool3 = fluid.layers.pool2d(
input=ince3b, pool_size=3, pool_type='max', pool_stride=2)
ince4a = self.inception("ince4a", pool3, 480, 192, 96, 208, 16, 48, 64)
ince4b = self.inception("ince4b", ince4a, 512, 160, 112, 224, 24, 64,
64)
ince4c = self.inception("ince4c", ince4b, 512, 128, 128, 256, 24, 64,
64)
ince4d = self.inception("ince4d", ince4c, 512, 112, 144, 288, 32, 64,
64)
ince4e = self.inception("ince4e", ince4d, 528, 256, 160, 320, 32, 128,
128)
pool4 = fluid.layers.pool2d(
input=ince4e, pool_size=3, pool_type='max', pool_stride=2)
ince5a = self.inception("ince5a", pool4, 832, 256, 160, 320, 32, 128,
128)
ince5b = self.inception("ince5b", ince5a, 832, 384, 192, 384, 48, 128,
128)
pool5 = fluid.layers.pool2d(
input=ince5b, pool_size=7, pool_type='avg', pool_stride=7)
dropout = fluid.layers.dropout(x=pool5, dropout_prob=0.4)
out = fluid.layers.fc(input=dropout,
size=class_dim,
act='softmax',
param_attr=self.xavier(1024, 1))
pool_o1 = fluid.layers.pool2d(
input=ince4a, pool_size=5, pool_type='avg', pool_stride=3)
conv_o1 = self.conv_layer(
input=pool_o1, num_filters=128, filter_size=1, stride=1, act=None)
fc_o1 = fluid.layers.fc(input=conv_o1,
size=1024,
act='relu',
param_attr=self.xavier(2048, 1))
dropout_o1 = fluid.layers.dropout(x=fc_o1, dropout_prob=0.7)
out1 = fluid.layers.fc(input=dropout_o1,
size=class_dim,
act='softmax',
param_attr=self.xavier(1024, 1))
pool_o2 = fluid.layers.pool2d(
input=ince4d, pool_size=5, pool_type='avg', pool_stride=3)
conv_o2 = self.conv_layer(
input=pool_o2, num_filters=128, filter_size=1, stride=1, act=None)
fc_o2 = fluid.layers.fc(input=conv_o2,
size=1024,
act='relu',
param_attr=self.xavier(2048, 1))
dropout_o2 = fluid.layers.dropout(x=fc_o2, dropout_prob=0.7)
out2 = fluid.layers.fc(input=dropout_o2,
size=class_dim,
act='softmax',
param_attr=self.xavier(1024, 1))
# last fc layer is "out"
return out, out1, out2
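GoogleNet is the only model here whose `net()` returns three predictions: the main classifier plus the two auxiliary heads attached after ince4a and ince4d. A minimal sketch of combining them into one training loss, assuming `image` and `label` are the usual Fluid data variables (the 0.3 weights follow the original GoogLeNet paper; only the first output is used at inference time):

```
# Hypothetical loss wiring for GoogleNet's three softmax classifiers.
model = GoogleNet()
out0, out1, out2 = model.net(input=image, class_dim=1000)
cost0 = fluid.layers.mean(fluid.layers.cross_entropy(input=out0, label=label))
cost1 = fluid.layers.mean(fluid.layers.cross_entropy(input=out1, label=label))
cost2 = fluid.layers.mean(fluid.layers.cross_entropy(input=out2, label=label))
# Auxiliary losses are down-weighted so they only regularize training.
avg_cost = cost0 + 0.3 * cost1 + 0.3 * cost2
```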
import paddle
import paddle.fluid as fluid
import math
__all__ = ['InceptionV4']
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class InceptionV4():
def __init__(self):
self.params = train_parameters
def net(self, input, class_dim=1000):
x = self.inception_stem(input)
for i in range(4):
x = self.inceptionA(x)
x = self.reductionA(x)
for i in range(7):
x = self.inceptionB(x)
x = self.reductionB(x)
for i in range(3):
x = self.inceptionC(x)
pool = fluid.layers.pool2d(
input=x, pool_size=8, pool_type='avg', global_pooling=True)
drop = fluid.layers.dropout(x=pool, dropout_prob=0.2)
stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0)
out = fluid.layers.fc(
input=drop,
size=class_dim,
act='softmax',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv, stdv)))
return out
def conv_bn_layer(self,
data,
num_filters,
filter_size,
stride=1,
padding=0,
groups=1,
act='relu'):
conv = fluid.layers.conv2d(
input=data,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=padding,
groups=groups,
act=None,
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def inception_stem(self, data):
conv = self.conv_bn_layer(data, 32, 3, stride=2, act='relu')
conv = self.conv_bn_layer(conv, 32, 3, act='relu')
conv = self.conv_bn_layer(conv, 64, 3, padding=1, act='relu')
pool1 = fluid.layers.pool2d(
input=conv, pool_size=3, pool_stride=2, pool_type='max')
conv2 = self.conv_bn_layer(conv, 96, 3, stride=2, act='relu')
concat = fluid.layers.concat([pool1, conv2], axis=1)
conv1 = self.conv_bn_layer(concat, 64, 1, act='relu')
conv1 = self.conv_bn_layer(conv1, 96, 3, act='relu')
conv2 = self.conv_bn_layer(concat, 64, 1, act='relu')
conv2 = self.conv_bn_layer(
conv2, 64, (7, 1), padding=(3, 0), act='relu')
conv2 = self.conv_bn_layer(
conv2, 64, (1, 7), padding=(0, 3), act='relu')
conv2 = self.conv_bn_layer(conv2, 96, 3, act='relu')
concat = fluid.layers.concat([conv1, conv2], axis=1)
conv1 = self.conv_bn_layer(concat, 192, 3, stride=2, act='relu')
pool1 = fluid.layers.pool2d(
input=concat, pool_size=3, pool_stride=2, pool_type='max')
concat = fluid.layers.concat([conv1, pool1], axis=1)
return concat
def inceptionA(self, data):
pool1 = fluid.layers.pool2d(
input=data, pool_size=3, pool_padding=1, pool_type='avg')
conv1 = self.conv_bn_layer(pool1, 96, 1, act='relu')
conv2 = self.conv_bn_layer(data, 96, 1, act='relu')
conv3 = self.conv_bn_layer(data, 64, 1, act='relu')
conv3 = self.conv_bn_layer(conv3, 96, 3, padding=1, act='relu')
conv4 = self.conv_bn_layer(data, 64, 1, act='relu')
conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu')
conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu')
concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1)
return concat
def reductionA(self, data):
pool1 = fluid.layers.pool2d(
input=data, pool_size=3, pool_stride=2, pool_type='max')
conv2 = self.conv_bn_layer(data, 384, 3, stride=2, act='relu')
conv3 = self.conv_bn_layer(data, 192, 1, act='relu')
conv3 = self.conv_bn_layer(conv3, 224, 3, padding=1, act='relu')
conv3 = self.conv_bn_layer(conv3, 256, 3, stride=2, act='relu')
concat = fluid.layers.concat([pool1, conv2, conv3], axis=1)
return concat
def inceptionB(self, data):
pool1 = fluid.layers.pool2d(
input=data, pool_size=3, pool_padding=1, pool_type='avg')
conv1 = self.conv_bn_layer(pool1, 128, 1, act='relu')
conv2 = self.conv_bn_layer(data, 384, 1, act='relu')
conv3 = self.conv_bn_layer(data, 192, 1, act='relu')
conv3 = self.conv_bn_layer(
conv3, 224, (1, 7), padding=(0, 3), act='relu')
conv3 = self.conv_bn_layer(
conv3, 256, (7, 1), padding=(3, 0), act='relu')
conv4 = self.conv_bn_layer(data, 192, 1, act='relu')
conv4 = self.conv_bn_layer(
conv4, 192, (1, 7), padding=(0, 3), act='relu')
conv4 = self.conv_bn_layer(
conv4, 224, (7, 1), padding=(3, 0), act='relu')
conv4 = self.conv_bn_layer(
conv4, 224, (1, 7), padding=(0, 3), act='relu')
conv4 = self.conv_bn_layer(
conv4, 256, (7, 1), padding=(3, 0), act='relu')
concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1)
return concat
def reductionB(self, data):
pool1 = fluid.layers.pool2d(
input=data, pool_size=3, pool_stride=2, pool_type='max')
conv2 = self.conv_bn_layer(data, 192, 1, act='relu')
conv2 = self.conv_bn_layer(conv2, 192, 3, stride=2, act='relu')
conv3 = self.conv_bn_layer(data, 256, 1, act='relu')
conv3 = self.conv_bn_layer(
conv3, 256, (1, 7), padding=(0, 3), act='relu')
conv3 = self.conv_bn_layer(
conv3, 320, (7, 1), padding=(3, 0), act='relu')
conv3 = self.conv_bn_layer(conv3, 320, 3, stride=2, act='relu')
concat = fluid.layers.concat([pool1, conv2, conv3], axis=1)
return concat
def inceptionC(self, data):
pool1 = fluid.layers.pool2d(
input=data, pool_size=3, pool_padding=1, pool_type='avg')
conv1 = self.conv_bn_layer(pool1, 256, 1, act='relu')
conv2 = self.conv_bn_layer(data, 256, 1, act='relu')
conv3 = self.conv_bn_layer(data, 384, 1, act='relu')
conv3_1 = self.conv_bn_layer(
conv3, 256, (1, 3), padding=(0, 1), act='relu')
conv3_2 = self.conv_bn_layer(
conv3, 256, (3, 1), padding=(1, 0), act='relu')
conv4 = self.conv_bn_layer(data, 384, 1, act='relu')
conv4 = self.conv_bn_layer(
conv4, 448, (1, 3), padding=(0, 1), act='relu')
conv4 = self.conv_bn_layer(
conv4, 512, (3, 1), padding=(1, 0), act='relu')
conv4_1 = self.conv_bn_layer(
conv4, 256, (1, 3), padding=(0, 1), act='relu')
conv4_2 = self.conv_bn_layer(
conv4, 256, (3, 1), padding=(1, 0), act='relu')
concat = fluid.layers.concat(
[conv1, conv2, conv3_1, conv3_2, conv4_1, conv4_2], axis=1)
return concat
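All model classes in this commit share the same interface: construct, then call `net(input, class_dim)` to get softmax probabilities. A minimal sketch, assuming `image` is a Fluid data variable matching the `input_size` in `train_parameters`:

```
# Hypothetical usage; every model class here exposes net(input, class_dim).
model = InceptionV4()
out = model.net(input=image, class_dim=1000)
```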
import paddle
import paddle.fluid as fluid
import paddle.fluid.layers.ops as ops
from paddle.fluid.initializer import init_on_cpu
from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
import math
def cosine_decay(learning_rate, step_each_epoch, epochs=120):
"""Applies cosine decay to the learning rate.
decayed_lr = learning_rate * (cos(epoch * (pi / epochs)) + 1) / 2
"""
global_step = _decay_step_counter()
with init_on_cpu():
epoch = ops.floor(global_step / step_each_epoch)
decayed_lr = learning_rate * \
(ops.cos(epoch * (math.pi / epochs)) + 1)/2
return decayed_lr
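A sketch of plugging `cosine_decay` into an optimizer, assuming `total_images` and `batch_size` are supplied by the training script:

```
# Hypothetical optimizer setup using the cosine schedule defined above.
step = int(total_images / batch_size + 1)  # iterations per epoch
optimizer = fluid.optimizer.Momentum(
    learning_rate=cosine_decay(
        learning_rate=0.1, step_each_epoch=step, epochs=120),
    momentum=0.9,
    regularization=fluid.regularizer.L2Decay(1e-4))
```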
import paddle.fluid as fluid
from paddle.fluid.initializer import MSRA
from paddle.fluid.param_attr import ParamAttr
__all__ = ['MobileNet']
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class MobileNet():
def __init__(self):
self.params = train_parameters
def net(self, input, class_dim=1000, scale=1.0):
# conv1: 112x112
input = self.conv_bn_layer(
input,
filter_size=3,
channels=3,
num_filters=int(32 * scale),
stride=2,
padding=1)
# 56x56
input = self.depthwise_separable(
input,
num_filters1=32,
num_filters2=64,
num_groups=32,
stride=1,
scale=scale)
input = self.depthwise_separable(
input,
num_filters1=64,
num_filters2=128,
num_groups=64,
stride=2,
scale=scale)
# 28x28
input = self.depthwise_separable(
input,
num_filters1=128,
num_filters2=128,
num_groups=128,
stride=1,
scale=scale)
input = self.depthwise_separable(
input,
num_filters1=128,
num_filters2=256,
num_groups=128,
stride=2,
scale=scale)
# 14x14
input = self.depthwise_separable(
input,
num_filters1=256,
num_filters2=256,
num_groups=256,
stride=1,
scale=scale)
input = self.depthwise_separable(
input,
num_filters1=256,
num_filters2=512,
num_groups=256,
stride=2,
scale=scale)
# 14x14
for i in range(5):
input = self.depthwise_separable(
input,
num_filters1=512,
num_filters2=512,
num_groups=512,
stride=1,
scale=scale)
# 7x7
input = self.depthwise_separable(
input,
num_filters1=512,
num_filters2=1024,
num_groups=512,
stride=2,
scale=scale)
input = self.depthwise_separable(
input,
num_filters1=1024,
num_filters2=1024,
num_groups=1024,
stride=1,
scale=scale)
input = fluid.layers.pool2d(
input=input,
pool_size=0,
pool_stride=1,
pool_type='avg',
global_pooling=True)
output = fluid.layers.fc(input=input,
size=class_dim,
act='softmax',
param_attr=ParamAttr(initializer=MSRA()))
return output
def conv_bn_layer(self,
input,
filter_size,
num_filters,
stride,
padding,
channels=None,
num_groups=1,
act='relu',
use_cudnn=True):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=padding,
groups=num_groups,
act=None,
use_cudnn=use_cudnn,
param_attr=ParamAttr(initializer=MSRA()),
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def depthwise_separable(self, input, num_filters1, num_filters2, num_groups,
stride, scale):
depthwise_conv = self.conv_bn_layer(
input=input,
filter_size=3,
num_filters=int(num_filters1 * scale),
stride=stride,
padding=1,
num_groups=int(num_groups * scale),
use_cudnn=False)
pointwise_conv = self.conv_bn_layer(
input=depthwise_conv,
filter_size=1,
num_filters=int(num_filters2 * scale),
stride=1,
padding=0)
return pointwise_conv
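The `scale` argument is MobileNet's width multiplier: every channel count is scaled by it, so parameters and multiply-adds shrink roughly quadratically. A hedged sketch, with `image` assumed as before:

```
# Hypothetical: a 0.5x-width MobileNet costs roughly a quarter of the
# multiply-adds of the full model, at some accuracy loss.
model = MobileNet()
out = model.net(input=image, class_dim=1000, scale=0.5)
```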
import paddle
import paddle.fluid as fluid
import math
__all__ = ["ResNet", "ResNet50", "ResNet101", "ResNet152"]
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class ResNet():
def __init__(self, layers=50):
self.params = train_parameters
self.layers = layers
def net(self, input, class_dim=1000):
layers = self.layers
supported_layers = [50, 101, 152]
assert layers in supported_layers, \
"supported layers are {} but input layer is {}".format(supported_layers, layers)
if layers == 50:
depth = [3, 4, 6, 3]
elif layers == 101:
depth = [3, 4, 23, 3]
elif layers == 152:
depth = [3, 8, 36, 3]
num_filters = [64, 128, 256, 512]
conv = self.conv_bn_layer(
input=input, num_filters=64, filter_size=7, stride=2, act='relu')
conv = fluid.layers.pool2d(
input=conv,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
for block in range(len(depth)):
for i in range(depth[block]):
conv = self.bottleneck_block(
input=conv,
num_filters=num_filters[block],
stride=2 if i == 0 and block != 0 else 1)
pool = fluid.layers.pool2d(
input=conv, pool_size=7, pool_type='avg', global_pooling=True)
stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
out = fluid.layers.fc(input=pool,
size=class_dim,
act='softmax',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv,
stdv)))
return out
def conv_bn_layer(self,
input,
num_filters,
filter_size,
stride=1,
groups=1,
act=None):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=groups,
act=None,
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def shortcut(self, input, ch_out, stride):
ch_in = input.shape[1]
if ch_in != ch_out or stride != 1:
return self.conv_bn_layer(input, ch_out, 1, stride)
else:
return input
def bottleneck_block(self, input, num_filters, stride):
conv0 = self.conv_bn_layer(
input=input, num_filters=num_filters, filter_size=1, act='relu')
conv1 = self.conv_bn_layer(
input=conv0,
num_filters=num_filters,
filter_size=3,
stride=stride,
act='relu')
conv2 = self.conv_bn_layer(
input=conv1, num_filters=num_filters * 4, filter_size=1, act=None)
short = self.shortcut(input, num_filters * 4, stride)
return fluid.layers.elementwise_add(x=short, y=conv2, act='relu')
def ResNet50():
model = ResNet(layers=50)
return model
def ResNet101():
model = ResNet(layers=101)
return model
def ResNet152():
model = ResNet(layers=152)
return model
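A sketch of consuming the `train_parameters` each model carries to build its declared piecewise learning-rate schedule, assuming `total_images` is the training-set size:

```
# Hypothetical: converting the epoch milestones in train_parameters into
# the step boundaries that fluid.layers.piecewise_decay expects.
model = ResNet50()
params = model.params["learning_strategy"]
step = int(total_images / params["batch_size"] + 1)
bd = [step * e for e in params["epochs"]]   # epochs [30, 60, 90] -> steps
lr = fluid.layers.piecewise_decay(boundaries=bd, values=params["steps"])
```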
import paddle
import paddle.fluid as fluid
import math
__all__ = [
"SE_ResNeXt", "SE_ResNeXt50_32x4d", "SE_ResNeXt101_32x4d",
"SE_ResNeXt152_32x4d"
]
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class SE_ResNeXt():
def __init__(self, layers=50):
self.params = train_parameters
self.layers = layers
def net(self, input, class_dim=1000):
layers = self.layers
supported_layers = [50, 101, 152]
assert layers in supported_layers, \
"supported layers are {} but input layer is {}".format(supported_layers, layers)
if layers == 50:
cardinality = 32
reduction_ratio = 16
depth = [3, 4, 6, 3]
num_filters = [128, 256, 512, 1024]
conv = self.conv_bn_layer(
input=input,
num_filters=64,
filter_size=7,
stride=2,
act='relu')
conv = fluid.layers.pool2d(
input=conv,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
elif layers == 101:
cardinality = 32
reduction_ratio = 16
depth = [3, 4, 23, 3]
num_filters = [128, 256, 512, 1024]
conv = self.conv_bn_layer(
input=input,
num_filters=64,
filter_size=7,
stride=2,
act='relu')
conv = fluid.layers.pool2d(
input=conv,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
elif layers == 152:
cardinality = 64
reduction_ratio = 16
depth = [3, 8, 36, 3]
num_filters = [128, 256, 512, 1024]
conv = self.conv_bn_layer(
input=input,
num_filters=64,
filter_size=3,
stride=2,
act='relu')
conv = self.conv_bn_layer(
input=conv, num_filters=64, filter_size=3, stride=1, act='relu')
conv = self.conv_bn_layer(
input=conv,
num_filters=128,
filter_size=3,
stride=1,
act='relu')
conv = fluid.layers.pool2d(
input=conv, pool_size=3, pool_stride=2, pool_padding=1, \
pool_type='max')
for block in range(len(depth)):
for i in range(depth[block]):
conv = self.bottleneck_block(
input=conv,
num_filters=num_filters[block],
stride=2 if i == 0 and block != 0 else 1,
cardinality=cardinality,
reduction_ratio=reduction_ratio)
pool = fluid.layers.pool2d(
input=conv, pool_size=7, pool_type='avg', global_pooling=True)
drop = fluid.layers.dropout(x=pool, dropout_prob=0.5)
stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0)
out = fluid.layers.fc(input=drop,
size=class_dim,
act='softmax',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv,
stdv)))
return out
def shortcut(self, input, ch_out, stride):
ch_in = input.shape[1]
if ch_in != ch_out or stride != 1:
filter_size = 1
return self.conv_bn_layer(input, ch_out, filter_size, stride)
else:
return input
def bottleneck_block(self, input, num_filters, stride, cardinality,
reduction_ratio):
conv0 = self.conv_bn_layer(
input=input, num_filters=num_filters, filter_size=1, act='relu')
conv1 = self.conv_bn_layer(
input=conv0,
num_filters=num_filters,
filter_size=3,
stride=stride,
groups=cardinality,
act='relu')
conv2 = self.conv_bn_layer(
input=conv1, num_filters=num_filters * 2, filter_size=1, act=None)
scale = self.squeeze_excitation(
input=conv2,
num_channels=num_filters * 2,
reduction_ratio=reduction_ratio)
short = self.shortcut(input, num_filters * 2, stride)
return fluid.layers.elementwise_add(x=short, y=scale, act='relu')
def conv_bn_layer(self,
input,
num_filters,
filter_size,
stride=1,
groups=1,
act=None):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=groups,
act=None,
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def squeeze_excitation(self, input, num_channels, reduction_ratio):
pool = fluid.layers.pool2d(
input=input, pool_size=0, pool_type='avg', global_pooling=True)
stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
squeeze = fluid.layers.fc(input=pool,
size=num_channels // reduction_ratio,
act='relu',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(
-stdv, stdv)))
stdv = 1.0 / math.sqrt(squeeze.shape[1] * 1.0)
excitation = fluid.layers.fc(input=squeeze,
size=num_channels,
act='sigmoid',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(
-stdv, stdv)))
scale = fluid.layers.elementwise_mul(x=input, y=excitation, axis=0)
return scale
def SE_ResNeXt50_32x4d():
model = SE_ResNeXt(layers=50)
return model
def SE_ResNeXt101_32x4d():
model = SE_ResNeXt(layers=101)
return model
def SE_ResNeXt152_32x4d():
model = SE_ResNeXt(layers=152)
return model
import paddle
import paddle.fluid as fluid
__all__ = ["VGGNet", "VGG11", "VGG13", "VGG16", "VGG19"]
train_parameters = {
"input_size": [3, 224, 224],
"input_mean": [0.485, 0.456, 0.406],
"input_std": [0.229, 0.224, 0.225],
"learning_strategy": {
"name": "piecewise_decay",
"batch_size": 256,
"epochs": [30, 60, 90],
"steps": [0.1, 0.01, 0.001, 0.0001]
}
}
class VGGNet():
def __init__(self, layers=16):
self.params = train_parameters
self.layers = layers
def net(self, input, class_dim=1000):
layers = self.layers
vgg_spec = {
11: ([1, 1, 2, 2, 2]),
13: ([2, 2, 2, 2, 2]),
16: ([2, 2, 3, 3, 3]),
19: ([2, 2, 4, 4, 4])
}
assert layers in vgg_spec.keys(), \
"supported layers are {} but input layer is {}".format(vgg_spec.keys(), layers)
nums = vgg_spec[layers]
conv1 = self.conv_block(input, 64, nums[0])
conv2 = self.conv_block(conv1, 128, nums[1])
conv3 = self.conv_block(conv2, 256, nums[2])
conv4 = self.conv_block(conv3, 512, nums[3])
conv5 = self.conv_block(conv4, 512, nums[4])
fc_dim = 4096
fc1 = fluid.layers.fc(
input=conv5,
size=fc_dim,
act='relu',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Normal(scale=0.005)),
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Constant(value=0.1)))
fc1 = fluid.layers.dropout(x=fc1, dropout_prob=0.5)
fc2 = fluid.layers.fc(
input=fc1,
size=fc_dim,
act='relu',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Normal(scale=0.005)),
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Constant(value=0.1)))
fc2 = fluid.layers.dropout(x=fc2, dropout_prob=0.5)
out = fluid.layers.fc(
input=fc2,
size=class_dim,
act='softmax',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Normal(scale=0.005)),
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Constant(value=0.1)))
return out
def conv_block(self, input, num_filter, groups):
conv = input
for i in range(groups):
conv = fluid.layers.conv2d(
input=conv,
num_filters=num_filter,
filter_size=3,
stride=1,
padding=1,
act='relu',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Normal(scale=0.01)),
bias_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Constant(value=0.0)))
return fluid.layers.pool2d(
input=conv, pool_size=2, pool_type='max', pool_stride=2)
def VGG11():
model = VGGNet(layers=11)
return model
def VGG13():
model = VGGNet(layers=13)
return model
def VGG16():
model = VGGNet(layers=16)
return model
def VGG19():
model = VGGNet(layers=19)
return model
@@ -11,7 +11,7 @@ random.seed(0)
 DATA_DIM = 224
 THREAD = 8
-BUF_SIZE = 1024
+BUF_SIZE = 102400
 DATA_DIR = 'data/ILSVRC2012'
 TRAIN_LIST = 'data/ILSVRC2012/train_list.txt'
@@ -105,7 +105,7 @@ def process_image(sample, mode, color_jitter, rotate):
         if rotate: img = rotate_image(img)
         img = random_crop(img, DATA_DIM)
     else:
-        img = resize_short(img, DATA_DIM)
+        img = resize_short(img, target_size=256)
         img = crop_image(img, target_size=DATA_DIM, center=True)
     if mode == 'train':
         if color_jitter:
@@ -120,9 +120,9 @@ def process_image(sample, mode, color_jitter, rotate):
     img -= img_mean
     img /= img_std
-    if mode == 'train' or mode == 'test':
+    if mode == 'train' or mode == 'val':
         return img, sample[1]
-    elif mode == 'infer':
+    elif mode == 'test':
         return [img]
@@ -137,11 +137,11 @@ def _reader_creator(file_list,
     if shuffle:
         random.shuffle(lines)
     for line in lines:
-        if mode == 'train' or mode == 'test':
+        if mode == 'train' or mode == 'val':
            img_path, label = line.split()
            img_path = os.path.join(DATA_DIR, img_path)
            yield img_path, int(label)
-        elif mode == 'infer':
+        elif mode == 'test':
            img_path = os.path.join(DATA_DIR, line)
            yield [img_path]
@@ -156,9 +156,9 @@ def train(file_list=TRAIN_LIST):
         file_list, 'train', shuffle=True, color_jitter=False, rotate=False)
-def test(file_list=TEST_LIST):
-    return _reader_creator(file_list, 'test', shuffle=False)
+def val(file_list=TEST_LIST):
+    return _reader_creator(file_list, 'val', shuffle=False)
-def infer(file_list):
-    return _reader_creator(file_list, 'infer', shuffle=False)
+def test(file_list):
+    return _reader_creator(file_list, 'test', shuffle=False)
import os
import numpy as np
import time
import sys
import paddle
import paddle.fluid as fluid
import reader
import paddle.fluid.layers.control_flow as control_flow
import paddle.fluid.layers.nn as nn
import paddle.fluid.layers.tensor as tensor
import math
def conv_bn_layer(input, num_filters, filter_size, stride=1, groups=1,
act=None):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=groups,
act=None,
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def squeeze_excitation(input, num_channels, reduction_ratio):
pool = fluid.layers.pool2d(
input=input, pool_size=0, pool_type='avg', global_pooling=True)
stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
squeeze = fluid.layers.fc(input=pool,
size=num_channels // reduction_ratio,
act='relu',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv,
stdv)))
stdv = 1.0 / math.sqrt(squeeze.shape[1] * 1.0)
excitation = fluid.layers.fc(input=squeeze,
size=num_channels,
act='sigmoid',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(
-stdv, stdv)))
scale = fluid.layers.elementwise_mul(x=input, y=excitation, axis=0)
return scale
def shortcut(input, ch_out, stride):
ch_in = input.shape[1]
if ch_in != ch_out or stride != 1:
filter_size = 1
return conv_bn_layer(input, ch_out, filter_size, stride)
else:
return input
def bottleneck_block(input, num_filters, stride, cardinality, reduction_ratio):
conv0 = conv_bn_layer(
input=input, num_filters=num_filters, filter_size=1, act='relu')
conv1 = conv_bn_layer(
input=conv0,
num_filters=num_filters,
filter_size=3,
stride=stride,
groups=cardinality,
act='relu')
conv2 = conv_bn_layer(
input=conv1, num_filters=num_filters * 2, filter_size=1, act=None)
scale = squeeze_excitation(
input=conv2,
num_channels=num_filters * 2,
reduction_ratio=reduction_ratio)
short = shortcut(input, num_filters * 2, stride)
return fluid.layers.elementwise_add(x=short, y=scale, act='relu')
def SE_ResNeXt(input, class_dim, infer=False, layers=50):
supported_layers = [50, 152]
if layers not in supported_layers:
print("supported layers are", supported_layers, \
"but input layer is ", layers)
exit()
if layers == 50:
cardinality = 32
reduction_ratio = 16
depth = [3, 4, 6, 3]
num_filters = [128, 256, 512, 1024]
conv = conv_bn_layer(
input=input, num_filters=64, filter_size=7, stride=2, act='relu')
conv = fluid.layers.pool2d(
input=conv,
pool_size=3,
pool_stride=2,
pool_padding=1,
pool_type='max')
elif layers == 152:
cardinality = 64
reduction_ratio = 16
depth = [3, 8, 36, 3]
num_filters = [128, 256, 512, 1024]
conv = conv_bn_layer(
input=input, num_filters=64, filter_size=3, stride=2, act='relu')
conv = conv_bn_layer(
input=conv, num_filters=64, filter_size=3, stride=1, act='relu')
conv = conv_bn_layer(
input=conv, num_filters=128, filter_size=3, stride=1, act='relu')
conv = fluid.layers.pool2d(
input=conv, pool_size=3, pool_stride=2, pool_padding=1, \
pool_type='max')
for block in range(len(depth)):
for i in range(depth[block]):
conv = bottleneck_block(
input=conv,
num_filters=num_filters[block],
stride=2 if i == 0 and block != 0 else 1,
cardinality=cardinality,
reduction_ratio=reduction_ratio)
pool = fluid.layers.pool2d(
input=conv, pool_size=7, pool_type='avg', global_pooling=True)
if not infer:
drop = fluid.layers.dropout(x=pool, dropout_prob=0.5)
else:
drop = pool
stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0)
out = fluid.layers.fc(input=drop,
size=class_dim,
act='softmax',
param_attr=fluid.param_attr.ParamAttr(
initializer=fluid.initializer.Uniform(-stdv,
stdv)))
return out
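This older function-style SE_ResNeXt (still present alongside the class-based version above) takes an `infer` flag to skip dropout when building an inference graph. A minimal sketch, `image` assumed as before:

```
# Hypothetical inference-graph construction with dropout disabled.
logits = SE_ResNeXt(input=image, class_dim=1000, infer=True, layers=50)
```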