diff --git a/fluid/PaddleCV/image_classification/README.md b/fluid/PaddleCV/image_classification/README.md index 985d24a5ab92b6edb723cec2b5cc78bafaf19812..4f37d8f5b57aed073e1e9522380bb4a1e9181d61 100644 --- a/fluid/PaddleCV/image_classification/README.md +++ b/fluid/PaddleCV/image_classification/README.md @@ -37,8 +37,6 @@ In the shell script ```download_imagenet2012.sh```, there are three steps to pr train/n02483708/n02483708_2436.jpeg 369 train/n03998194/n03998194_7015.jpeg 741 train/n04523525/n04523525_38118.jpeg 884 -train/n04596742/n04596742_3032.jpeg 909 -train/n03208938/n03208938_7065.jpeg 535 ... ``` * *val_list.txt*: label file of imagenet-2012 validation set, with each line seperated by ```SPACE```, like. @@ -46,8 +44,6 @@ train/n03208938/n03208938_7065.jpeg 535 val/ILSVRC2012_val_00000001.jpeg 65 val/ILSVRC2012_val_00000002.jpeg 970 val/ILSVRC2012_val_00000003.jpeg 230 -val/ILSVRC2012_val_00000004.jpeg 809 -val/ILSVRC2012_val_00000005.jpeg 516 ... ``` @@ -84,15 +80,14 @@ python train.py \ * **pretrained_model**: model path for pretraining. Default: None. * **checkpoint**: the checkpoint path to resume. Default: None. * **data_dir**: the data path. Default: "./data/ILSVRC2012". -* **model_category**: the category of models, ("models"|"models_name"). Default: "models_name". -* **fp16**: whether to enable half precisioin training with fp16. Default: False. +* **fp16**: whether to enable half precision training with fp16. Default: False. * **scale_loss**: scale loss for fp16. Default: 1.0. * **l2_decay**: L2_decay parameter. Default: 1e-4. * **momentum_rate**: momentum_rate. Default: 0.9. Or can start the training step by running the ```run.sh```. -**data reader introduction:** Data reader is defined in ```reader.py``` and ```reader_cv2.py```, Using CV2 reader can improve the speed of reading. In [training stage](#training-a-model), random crop and flipping are used, while center crop is used in [evaluation](#inference) and [inference](#inference) stages. 
Supported data augmentation includes: +**data reader introduction:** The data reader is defined in ```reader.py``` and ```reader_cv2.py```. Using the CV2 reader can improve reading speed. In the [training stage](#training-a-model-with-flexible-parameters), random crop and flipping are used, while center crop is used in the [Evaluation](#evaluation) and [Inference](#inference) stages. Supported data augmentation includes: * rotation * color jitter * random crop @@ -100,32 +95,11 @@ Or can start the training step by running the ```run.sh```. * resize * flipping -**training curve:** The training curve can be drawn based on training log. For example, the log from training AlexNet is like: -``` -End pass 1, train_loss 6.23153877258, train_acc1 0.0150696625933, train_acc5 0.0552518665791, test_loss 5.41981744766, test_acc1 0.0519132651389, test_acc5 0.156150355935 -End pass 2, train_loss 5.15442800522, train_acc1 0.0784279331565, train_acc5 0.211050540209, test_loss 4.45795249939, test_acc1 0.140469551086, test_acc5 0.333163291216 -End pass 3, train_loss 4.51505613327, train_acc1 0.145300447941, train_acc5 0.331567406654, test_loss 3.86548018456, test_acc1 0.219443559647, test_acc5 0.446448504925 -End pass 4, train_loss 4.12735557556, train_acc1 0.19437250495, train_acc5 0.405713528395, test_loss 3.56990146637, test_acc1 0.264536827803, test_acc5 0.507190704346 -End pass 5, train_loss 3.87505435944, train_acc1 0.229518383741, train_acc5 0.453582793474, test_loss 3.35345435143, test_acc1 0.297349333763, test_acc5 0.54753267765 -End pass 6, train_loss 3.6929500103, train_acc1 0.255628824234, train_acc5 0.487188398838, test_loss 3.17112898827, test_acc1 0.326953113079, test_acc5 0.581780135632 -End pass 7, train_loss 3.55882954597, train_acc1 0.275381118059, train_acc5 0.511990904808, test_loss 3.03736782074, test_acc1 0.349035382271, test_acc5 0.606293857098 -End pass 8, train_loss 3.45595097542, train_acc1 0.291462600231, train_acc5 0.530815005302, test_loss 2.96034455299, test_acc1 
0.362228929996, test_acc5 0.617390751839 -End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545210540295, test_loss 2.93932366371, test_acc1 0.37129303813, test_acc5 0.623573005199 -... -``` - -The error rate curves of AlexNet, ResNet50 and SE-ResNeXt-50 are shown in the figure below. -
-[Figure: Training and validation curves]
- - ## Using Mixed-Precision Training -You may add `--fp16 1` to start train using mixed precisioin training, which the training process will use float16 and the output model ("master" parameters) is saved as float32. You also may need to pass `--scale_loss` to overcome accuracy issues, usually `--scale_loss 8.0` will do. +You may add `--fp16=1` to start training with mixed precision: the training process uses float16, while the output model ("master" parameters) is saved as float32. You may also need to pass `--scale_loss` to overcome accuracy issues; usually `--scale_loss=8.0` will do. -Note that currently `--fp16` can not use together with `--with_mem_opt`, so pass `--with_mem_opt 0` to disable memory optimization pass. +Note that `--fp16` currently cannot be used together with `--with_mem_opt`, so pass `--with_mem_opt=0` to disable the memory optimization pass. ## Finetuning @@ -156,20 +130,6 @@ python eval.py \ --pretrained_model=${path_to_pretrain_model} ``` -According to the congfiguration of evaluation, the output log is like: -``` -Testbatch 0,loss 2.1786134243, acc1 0.625,acc5 0.8125,time 0.48 sec -Testbatch 10,loss 0.898496925831, acc1 0.75,acc5 0.9375,time 0.51 sec -Testbatch 20,loss 1.32524681091, acc1 0.6875,acc5 0.9375,time 0.37 sec -Testbatch 30,loss 1.46830511093, acc1 0.5,acc5 0.9375,time 0.51 sec -Testbatch 40,loss 1.12802267075, acc1 0.625,acc5 0.9375,time 0.35 sec -Testbatch 50,loss 0.881597697735, acc1 0.8125,acc5 1.0,time 0.32 sec -Testbatch 60,loss 0.300163716078, acc1 0.875,acc5 1.0,time 0.48 sec -Testbatch 70,loss 0.692037761211, acc1 0.875,acc5 1.0,time 0.35 sec -Testbatch 80,loss 0.0969972759485, acc1 1.0,acc5 1.0,time 0.41 sec -... -``` - ## Inference Inference is used to get prediction score or image features based on trained models. 
``` @@ -180,30 +140,10 @@ python infer.py \ --with_mem_opt=True \ --pretrained_model=${path_to_pretrain_model} ``` -The output contains predication results, including maximum score (before softmax) and corresponding predicted label. -``` -Test-0-score: [13.168352], class [491] -Test-1-score: [7.913302], class [975] -Test-2-score: [16.959702], class [21] -Test-3-score: [14.197695], class [383] -Test-4-score: [12.607652], class [878] -Test-5-score: [17.725458], class [15] -Test-6-score: [12.678599], class [118] -Test-7-score: [12.353498], class [505] -Test-8-score: [20.828007], class [747] -Test-9-score: [15.135801], class [315] -Test-10-score: [14.585114], class [920] -Test-11-score: [13.739927], class [679] -Test-12-score: [15.040644], class [386] -... -``` ## Supported models and performances -Models consists of two categories: Models with specified parameters names in model definition and Models without specified parameters, Generate named model by indicating ```model_category = models_name```. - -Models are trained by starting with learning rate ```0.1``` and decaying it by ```0.1``` after each pre-defined epoches, if not special introduced. Available top-1/top-5 validation accuracy on ImageNet 2012 are listed in table. Pretrained models can be downloaded by clicking related model names. - +Available top-1/top-5 validation accuracy on ImageNet 2012 are listed in table. Pretrained models can be downloaded by clicking related model names. 
- Released models: specify parameter names @@ -216,20 +156,12 @@ Models are trained by starting with learning rate ```0.1``` and decaying it by ` |[VGG19](https://paddle-imagenet-models-name.bj.bcebos.com/VGG19_pretrained.zip) | 72.56%/90.83% | 72.32%/90.98% | |[MobileNetV1](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.zip) | 70.91%/89.54% | 70.51%/89.35% | |[MobileNetV2](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.zip) | 71.90%/90.55% | 71.53%/90.41% | +|[ResNet18](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_pretrained.tar) | 70.85%/89.89% | 70.65%/89.89% | +|[ResNet34](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) | 74.41%/92.03% | 74.13%/91.97% | |[ResNet50](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.zip) | 76.35%/92.80% | 76.22%/92.92% | |[ResNet101](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.zip) | 77.49%/93.57% | 77.56%/93.64% | |[ResNet152](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet152_pretrained.zip) | 78.12%/93.93% | 77.92%/93.87% | |[SE_ResNeXt50_32x4d](https://paddle-imagenet-models-name.bj.bcebos.com/SE_ResNext50_32x4d_pretrained.zip) | 78.50%/94.01% | 78.44%/93.96% | |[SE_ResNeXt101_32x4d](https://paddle-imagenet-models-name.bj.bcebos.com/SE_ResNeXt101_32x4d_pretrained.zip) | 79.26%/94.22% | 79.12%/94.20% | - - - - -- Released models: not specify parameter names - -**NOTE: These are trained by using model_category=models** - -|model | top-1/top-5 accuracy(PIL)| top-1/top-5 accuracy(CV2) | -|- |:-: |:-:| -|[ResNet152](http://paddle-imagenet-models.bj.bcebos.com/ResNet152_pretrained.zip) | 78.18%/93.93% | 78.11%/94.04% | -|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.32%/93.96% | 77.58%/93.73% | +|[GoogleNet](https://paddle-imagenet-models-name.bj.bcebos.com/GoogleNet_pretrained.tar) | 70.50%/89.59% | 70.27%/89.58% | 
+|[ShuffleNetV2](https://paddle-imagenet-models-name.bj.bcebos.com/ShuffleNet_pretrained.tar) | | 69.48%/88.99% | diff --git a/fluid/PaddleCV/image_classification/README_cn.md b/fluid/PaddleCV/image_classification/README_cn.md index 803bdc99fc607105813f509d69119b3d117dc2e1..367aa5f8e204de4be0152ddc473394a405ba905c 100644 --- a/fluid/PaddleCV/image_classification/README_cn.md +++ b/fluid/PaddleCV/image_classification/README_cn.md @@ -1,20 +1,21 @@ # 图像分类以及模型库 -图像分类是计算机视觉的重要领域,它的目标是将图像分类到预定义的标签。近期,需要研究者提出很多不同种类的神经网络,并且极大的提升了分类算法的性能。本页将介绍如何使用PaddlePaddle进行图像分类,包括[数据准备](#data-preparation)、 [训练](#training-a-model)、[参数微调](#finetuning)、[模型评估](#evaluation)以及[模型推断](#inference)。 +图像分类是计算机视觉的重要领域,它的目标是将图像分类到预定义的标签。近期,许多研究者提出很多不同种类的神经网络,并且极大的提升了分类算法的性能。本页将介绍如何使用PaddlePaddle进行图像分类。 --- ## 内容 -- [安装](#installation) -- [数据准备](#data-preparation) -- [模型训练](#training-a-model) -- [参数微调](#finetuning) -- [模型评估](#evaluation) -- [模型推断](#inference) -- [已有模型及其性能](#supported-models) +- [安装](#安装) +- [数据准备](#数据准备) +- [模型训练](#模型训练) +- [混合精度训练](#混合精度训练) +- [参数微调](#参数微调) +- [模型评估](#模型评估) +- [模型预测](#模型预测) +- [已有模型及其性能](#已有模型及其性能) ## 安装 -在当前目录下运行样例代码需要PadddlePaddle Fluid的v0.13.0或以上的版本。如果你的运行环境中的PaddlePaddle低于此版本,请根据安装文档中的说明来更新PaddlePaddle。 +在当前目录下运行样例代码需要PadddlePaddle Fluid的v0.13.0或以上的版本。如果你的运行环境中的PaddlePaddle低于此版本,请根据 [installation document](http://paddlepaddle.org/documentation/docs/zh/1.3/beginners_guide/install/index_cn.html) 中的说明来更新PaddlePaddle。 ## 数据准备 @@ -36,8 +37,6 @@ sh download_imagenet2012.sh train/n02483708/n02483708_2436.jpeg 369 train/n03998194/n03998194_7015.jpeg 741 train/n04523525/n04523525_38118.jpeg 884 -train/n04596742/n04596742_3032.jpeg 909 -train/n03208938/n03208938_7065.jpeg 535 ... 
``` * *val_list.txt*: ImageNet-2012验证集合的标签文件,每一行采用"空格"分隔图像路径与标注,例如: @@ -45,10 +44,9 @@ train/n03208938/n03208938_7065.jpeg 535 val/ILSVRC2012_val_00000001.jpeg 65 val/ILSVRC2012_val_00000002.jpeg 970 val/ILSVRC2012_val_00000003.jpeg 230 -val/ILSVRC2012_val_00000004.jpeg 809 -val/ILSVRC2012_val_00000005.jpeg 516 ... ``` +注意:需要根据本地环境调整reader.py相关路径来正确读取数据。 ## 模型训练 @@ -66,22 +64,27 @@ python train.py \ --lr=0.1 ``` **参数说明:** -* **model**: name model to use. Default: "SE_ResNeXt50_32x4d". -* **num_epochs**: the number of epochs. Default: 120. -* **batch_size**: the size of each mini-batch. Default: 256. -* **use_gpu**: whether to use GPU or not. Default: True. -* **total_images**: total number of images in the training set. Default: 1281167. -* **class_dim**: the class number of the classification task. Default: 1000. -* **image_shape**: input size of the network. Default: "3,224,224". -* **model_save_dir**: the directory to save trained model. Default: "output". -* **with_mem_opt**: whether to use memory optimization or not. Default: False. -* **lr_strategy**: learning rate changing strategy. Default: "piecewise_decay". -* **lr**: initialized learning rate. Default: 0.1. -* **pretrained_model**: model path for pretraining. Default: None. -* **checkpoint**: the checkpoint path to resume. Default: None. -* **model_category**: the category of models, ("models"|"models_name"). Default:"models_name". - -**数据读取器说明:** 数据读取器定义在```reader.py```和```reader_cv2.py```中, 一般, CV2 reader可以提高数据读取速度, reader(PIL)可以得到相对更高的精度, 在[训练阶段](#training-a-model), 默认采用的增广方式是随机裁剪与水平翻转, 而在[评估](#inference)与[推断](#inference)阶段用的默认方式是中心裁剪。当前支持的数据增广方式有: +* **model**: 模型名称, 默认值: "SE_ResNeXt50_32x4d" +* **num_epochs**: 训练回合数,默认值: 120 +* **batch_size**: 批大小,默认值: 256 +* **use_gpu**: 是否在GPU上运行,默认值: True +* **total_images**: 图片数,ImageNet2012默认值: 1281167. 
+* **class_dim**: 类别数,默认值: 1000 +* **image_shape**: 图片大小,默认值: "3,224,224" +* **model_save_dir**: 模型存储路径,默认值: "output/" +* **with_mem_opt**: 是否开启显存优化,默认值: False +* **lr_strategy**: 学习率变化策略,默认值: "piecewise_decay" +* **lr**: 初始学习率,默认值: 0.1 +* **pretrained_model**: 预训练模型路径,默认值: None +* **checkpoint**: 用于继续训练的检查点(指定具体模型存储路径,如"output/SE_ResNeXt50_32x4d/100/"),默认值: None +* **fp16**: 是否开启混合精度训练,默认值: False +* **scale_loss**: 调整混合训练的loss scale值,默认值: 1.0 +* **l2_decay**: l2_decay值,默认值: 1e-4 +* **momentum_rate**: momentum_rate值,默认值: 0.9 + +在```run.sh```中有用于训练的脚本. + +**数据读取器说明:** 数据读取器定义在```reader.py```和```reader_cv2.py```中。一般, CV2可以提高数据读取速度, PIL reader可以得到相对更高的精度, 我们现在默认基于PIL的数据读取器, 在[训练阶段](#模型训练), 默认采用的增广方式是随机裁剪与水平翻转, 而在[模型评估](#模型评估)与[模型预测](#模型预测)阶段用的默认方式是中心裁剪。当前支持的数据增广方式有: * 旋转 * 颜色抖动 * 随机裁剪 @@ -89,31 +92,11 @@ python train.py \ * 长宽调整 * 水平翻转 -**训练曲线:** 通过训练过程中的日志可以画出训练曲线。举个例子,训练AlexNet出来的日志如下所示: -``` -End pass 1, train_loss 6.23153877258, train_acc1 0.0150696625933, train_acc5 0.0552518665791, test_loss 5.41981744766, test_acc1 0.0519132651389, test_acc5 0.156150355935 -End pass 2, train_loss 5.15442800522, train_acc1 0.0784279331565, train_acc5 0.211050540209, test_loss 4.45795249939, test_acc1 0.140469551086, test_acc5 0.333163291216 -End pass 3, train_loss 4.51505613327, train_acc1 0.145300447941, train_acc5 0.331567406654, test_loss 3.86548018456, test_acc1 0.219443559647, test_acc5 0.446448504925 -End pass 4, train_loss 4.12735557556, train_acc1 0.19437250495, train_acc5 0.405713528395, test_loss 3.56990146637, test_acc1 0.264536827803, test_acc5 0.507190704346 -End pass 5, train_loss 3.87505435944, train_acc1 0.229518383741, train_acc5 0.453582793474, test_loss 3.35345435143, test_acc1 0.297349333763, test_acc5 0.54753267765 -End pass 6, train_loss 3.6929500103, train_acc1 0.255628824234, train_acc5 0.487188398838, test_loss 3.17112898827, test_acc1 0.326953113079, test_acc5 0.581780135632 -End pass 7, train_loss 3.55882954597, train_acc1 0.275381118059, train_acc5 
0.511990904808, test_loss 3.03736782074, test_acc1 0.349035382271, test_acc5 0.606293857098 -End pass 8, train_loss 3.45595097542, train_acc1 0.291462600231, train_acc5 0.530815005302, test_loss 2.96034455299, test_acc1 0.362228929996, test_acc5 0.617390751839 -End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545210540295, test_loss 2.93932366371, test_acc1 0.37129303813, test_acc5 0.623573005199 -... -``` - -下图给出了AlexNet、ResNet50以及SE-ResNeXt-50网络的错误率曲线: -
-[Figure: Error rate curves on the training and validation sets]
- ## 混合精度训练 -可以通过开启`--fp16 1`启动混合精度训练,这样训练过程会使用float16数据,并输出float32的模型参数("master"参数)。您可能需要同时传入`--scale_loss`来解决fp16训练的精度问题,通常传入`--scale_loss 8.0`即可。 +可以通过开启`--fp16=True`启动混合精度训练,这样训练过程会使用float16数据,并输出float32的模型参数("master"参数)。您可能需要同时传入`--scale_loss`来解决fp16训练的精度问题,通常传入`--scale_loss=8.0`即可。 -注意,目前混合精度训练不能和内存优化功能同时使用,所以需要传`--with_mem_opt 0`这个参数来禁用内存优化功能。 +注意,目前混合精度训练不能和内存优化功能同时使用,所以需要传`--with_mem_opt=False`这个参数来禁用内存优化功能。 ## 参数微调 @@ -133,7 +116,7 @@ python train.py ``` ## 模型评估 -模型评估是指对训练完毕的模型评估各类性能指标。用户可以下载[预训练模型](#supported-models)并且设置```path_to_pretrain_model```为模型所在路径。运行如下的命令,可以获得一个模型top-1/top-5精度: +模型评估是指对训练完毕的模型评估各类性能指标。用户可以下载[已有模型及其性能](#已有模型及其性能)并且设置```path_to_pretrain_model```为模型所在路径。运行如下的命令,可以获得一个模型top-1/top-5精度: ``` python eval.py \ --model=SE_ResNeXt50_32x4d \ @@ -144,23 +127,8 @@ python eval.py \ --pretrained_model=${path_to_pretrain_model} ``` -根据这个评估程序的配置,输出日志形式如下: -``` -Testbatch 0,loss 2.1786134243, acc1 0.625,acc5 0.8125,time 0.48 sec -Testbatch 10,loss 0.898496925831, acc1 0.75,acc5 0.9375,time 0.51 sec -Testbatch 20,loss 1.32524681091, acc1 0.6875,acc5 0.9375,time 0.37 sec -Testbatch 30,loss 1.46830511093, acc1 0.5,acc5 0.9375,time 0.51 sec -Testbatch 40,loss 1.12802267075, acc1 0.625,acc5 0.9375,time 0.35 sec -Testbatch 50,loss 0.881597697735, acc1 0.8125,acc5 1.0,time 0.32 sec -Testbatch 60,loss 0.300163716078, acc1 0.875,acc5 1.0,time 0.48 sec -Testbatch 70,loss 0.692037761211, acc1 0.875,acc5 1.0,time 0.35 sec -Testbatch 80,loss 0.0969972759485, acc1 1.0,acc5 1.0,time 0.41 sec -... 
-``` - - -## 模型推断 -模型推断可以获取一个模型的预测分数或者图像的特征: +## 模型预测 +模型预测可以获取一个模型的预测分数或者图像的特征: ``` python infer.py \ --model=SE_ResNeXt50_32x4d \ @@ -169,31 +137,12 @@ python infer.py \ --with_mem_opt=True \ --pretrained_model=${path_to_pretrain_model} ``` -输出的预测结果包括最高分数(未经过softmax处理)以及相应的预测标签。 -``` -Test-0-score: [13.168352], class [491] -Test-1-score: [7.913302], class [975] -Test-2-score: [16.959702], class [21] -Test-3-score: [14.197695], class [383] -Test-4-score: [12.607652], class [878] -Test-5-score: [17.725458], class [15] -Test-6-score: [12.678599], class [118] -Test-7-score: [12.353498], class [505] -Test-8-score: [20.828007], class [747] -Test-9-score: [15.135801], class [315] -Test-10-score: [14.585114], class [920] -Test-11-score: [13.739927], class [679] -Test-12-score: [15.040644], class [386] -... -``` ## 已有模型及其性能 -Models包括两种模型:带有参数名字的模型,和不带有参数名字的模型。通过设置 ```model_category = models_name```来训练带有参数名字的模型。 - -表格中列出了在"models"目录下支持的神经网络种类,并且给出了已完成训练的模型在ImageNet-2012验证集合上的top-1/top-5精度;如无特征说明,训练模型的初始学习率为```0.1```,每隔预定的epochs会下降```0.1```。预训练模型可以通过点击相应模型的名称进行下载。 - +表格中列出了在```models```目录下支持的图像分类模型,并且给出了已完成训练的模型在ImageNet-2012验证集合上的top-1/top-5精度, +可以通过点击相应模型的名称下载相应预训练模型。 -- Released models: specify parameter names +- Released models: |model | top-1/top-5 accuracy(PIL)| top-1/top-5 accuracy(CV2) | |- |:-: |:-:| @@ -204,17 +153,12 @@ Models包括两种模型:带有参数名字的模型,和不带有参数名 |[VGG19](https://paddle-imagenet-models-name.bj.bcebos.com/VGG19_pretrained.zip) | 72.56%/90.83% | 72.32%/90.98% | |[MobileNetV1](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.zip) | 70.91%/89.54% | 70.51%/89.35% | |[MobileNetV2](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.zip) | 71.90%/90.55% | 71.53%/90.41% | +|[ResNet18](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_pretrained.tar) | 70.85%/89.89% | 70.65%/89.89% | +|[ResNet34](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) | 74.41%/92.03% | 74.13%/91.97% | 
|[ResNet50](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.zip) | 76.35%/92.80% | 76.22%/92.92% | |[ResNet101](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.zip) | 77.49%/93.57% | 77.56%/93.64% | |[ResNet152](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet152_pretrained.zip) | 78.12%/93.93% | 77.92%/93.87% | |[SE_ResNeXt50_32x4d](https://paddle-imagenet-models-name.bj.bcebos.com/SE_ResNext50_32x4d_pretrained.zip) | 78.50%/94.01% | 78.44%/93.96% | |[SE_ResNeXt101_32x4d](https://paddle-imagenet-models-name.bj.bcebos.com/SE_ResNeXt101_32x4d_pretrained.zip) | 79.26%/94.22% | 79.12%/94.20% | - -- Released models: not specify parameter names - -**注意:这是model_category = models 的预训练模型** - -|model | top-1/top-5 accuracy(PIL)| top-1/top-5 accuracy(CV2) | -|- |:-: |:-:| -|[ResNet152](http://paddle-imagenet-models.bj.bcebos.com/ResNet152_pretrained.zip) | 78.18%/93.93% | 78.11%/94.04% | -|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.32%/93.96% | 77.58%/93.73% | +|[GoogleNet](https://paddle-imagenet-models-name.bj.bcebos.com/GoogleNet_pretrained.tar) | 70.50%/89.59% | 70.27%/89.58% | +|[ShuffleNetV2](https://paddle-imagenet-models-name.bj.bcebos.com/ShuffleNet_pretrained.tar) | | 69.48%/88.99% | diff --git a/fluid/PaddleCV/image_classification/eval.py b/fluid/PaddleCV/image_classification/eval.py index 0660efe13750467ad6bf964b484c9db0ab44b1ee..3bcc4e696c4795a98408a9b1e0f4ff5d0303ec77 100644 --- a/fluid/PaddleCV/image_classification/eval.py +++ b/fluid/PaddleCV/image_classification/eval.py @@ -7,12 +7,12 @@ import time import sys import paddle import paddle.fluid as fluid -#import reader_cv2 as reader import reader as reader import argparse import functools +import models from utils.learning_rate import cosine_decay -from utility import add_arguments, print_arguments +from utils.utility import add_arguments, print_arguments import math parser = 
argparse.ArgumentParser(description=__doc__) @@ -25,22 +25,9 @@ add_arg('image_shape', str, "3,224,224", "Input image size") add_arg('with_mem_opt', bool, True, "Whether to use memory optimization or not.") add_arg('pretrained_model', str, None, "Whether to use pretrained model.") add_arg('model', str, "SE_ResNeXt50_32x4d", "Set the network to use.") -add_arg('model_category', str, "models_name", "Whether to use models_name or not, valid value:'models','models_name'." ) # yapf: enable - -def set_models(model_category): - global models - assert model_category in ["models", "models_name" - ], "{} is not in lists: {}".format( - model_category, ["models", "models_name"]) - if model_category == "models_name": - import models_name as models - else: - import models as models - - def eval(args): # parameters from arguments class_dim = args.class_dim @@ -119,7 +106,7 @@ def eval(args): if batch_id % 10 == 0: print("Testbatch {0},loss {1}, " "acc1 {2},acc5 {3},time {4}".format(batch_id, \ - loss, acc1, acc5, \ + "%.5f"%loss,"%.5f"%acc1, "%.5f"%acc5, \ "%2.2f sec" % period)) sys.stdout.flush() @@ -128,14 +115,13 @@ def eval(args): test_acc5 = np.sum(test_info[2]) / cnt print("Test_loss {0}, test_acc1 {1}, test_acc5 {2}".format( - test_loss, test_acc1, test_acc5)) + "%.5f"%test_loss, "%.5f"%test_acc1, "%.5f"%test_acc5)) sys.stdout.flush() def main(): args = parser.parse_args() print_arguments(args) - set_models(args.model_category) eval(args) diff --git a/fluid/PaddleCV/image_classification/infer.py b/fluid/PaddleCV/image_classification/infer.py index 88ccf42912b67035895cd81f5f982edca1bd0a3e..e6e126f259c15dfe62acba167a04fe5caed58ad7 100644 --- a/fluid/PaddleCV/image_classification/infer.py +++ b/fluid/PaddleCV/image_classification/infer.py @@ -10,7 +10,9 @@ import paddle.fluid as fluid import reader import argparse import functools -from utility import add_arguments, print_arguments +import models +import utils +from utils.utility import add_arguments,print_arguments import 
math parser = argparse.ArgumentParser(description=__doc__) @@ -22,25 +24,14 @@ add_arg('image_shape', str, "3,224,224", "Input image size") add_arg('with_mem_opt', bool, True, "Whether to use memory optimization or not.") add_arg('pretrained_model', str, None, "Whether to use pretrained model.") add_arg('model', str, "SE_ResNeXt50_32x4d", "Set the network to use.") -add_arg('model_category', str, "models_name", "Whether to use models_name or not, valid value:'models','models_name'." ) +add_arg('save_inference', bool, False, "Whether to save inference model or not") # yapf: enable - -def set_models(model_category): - global models - assert model_category in ["models", "models_name" - ], "{} is not in lists: {}".format( - model_category, ["models", "models_name"]) - if model_category == "models_name": - import models_name as models - else: - import models as models - - def infer(args): # parameters from arguments class_dim = args.class_dim model_name = args.model + save_inference = args.save_inference pretrained_model = args.pretrained_model with_memory_optimization = args.with_mem_opt image_shape = [int(m) for m in args.image_shape.split(",")] @@ -52,7 +43,7 @@ def infer(args): # model definition model = models.__dict__[model_name]() - if model_name is "GoogleNet": + if model_name == "GoogleNet": out, _, _ = model.net(input=image, class_dim=class_dim) else: out = model.net(input=image, class_dim=class_dim) @@ -60,7 +51,7 @@ def infer(args): test_program = fluid.default_main_program().clone(for_test=True) fetch_list = [out.name] - if with_memory_optimization: + if with_memory_optimization and not save_inference: fluid.memory_optimize( fluid.default_main_program(), skip_opt_set=set(fetch_list)) @@ -74,7 +65,17 @@ def infer(args): return os.path.exists(os.path.join(pretrained_model, var.name)) fluid.io.load_vars(exe, pretrained_model, predicate=if_exist) - + if save_inference: + fluid.io.save_inference_model( + dirname=model_name, + feeded_var_names=['image'], + 
main_program=test_program, + target_vars=out, + executor=exe, + model_filename='model', + params_filename='params') + print("model: ",model_name," is already saved") + exit(0) test_batch_size = 1 test_reader = paddle.batch(reader.test(), batch_size=test_batch_size) feeder = fluid.DataFeeder(place=place, feed_list=[image]) @@ -94,7 +95,6 @@ def infer(args): def main(): args = parser.parse_args() print_arguments(args) - set_models(args.model_category) infer(args) diff --git a/fluid/PaddleCV/image_classification/legacy/README.md b/fluid/PaddleCV/image_classification/legacy/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e4603a62da2bc15e0f834a34d403e94b332fd41b --- /dev/null +++ b/fluid/PaddleCV/image_classification/legacy/README.md @@ -0,0 +1,10 @@ +For historical reasons, we keep the "no name" models here; they differ from the "specified name" models. + +**NOTE: Training the models in the legacy folder will generate models without specified parameter names.** + +- **Released models: parameter names not specified** + +|model | top-1/top-5 accuracy(PIL)| top-1/top-5 accuracy(CV2) | +|- |:-: |:-:| +|[ResNet152](http://paddle-imagenet-models.bj.bcebos.com/ResNet152_pretrained.zip) | 78.18%/93.93% | 78.11%/94.04% | +|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.32%/93.96% | 77.58%/93.73% | diff --git a/fluid/PaddleCV/image_classification/models_name/__init__.py b/fluid/PaddleCV/image_classification/legacy/models/__init__.py similarity index 86% rename from fluid/PaddleCV/image_classification/models_name/__init__.py rename to fluid/PaddleCV/image_classification/legacy/models/__init__.py index ea0216e0fac43235e1793f0d8964a306017af7da..9659029482ed6f51c9a63ac330ad9d8fd3b8b98f 100644 --- a/fluid/PaddleCV/image_classification/models_name/__init__.py +++ b/fluid/PaddleCV/image_classification/legacy/models/__init__.py @@ -4,7 +4,9 @@ from .mobilenet_v2 import MobileNetV2 from .googlenet import GoogleNet from 
.vgg import VGG11, VGG13, VGG16, VGG19 from .resnet import ResNet50, ResNet101, ResNet152 +from .resnet_dist import DistResNet from .inception_v4 import InceptionV4 from .se_resnext import SE_ResNeXt50_32x4d, SE_ResNeXt101_32x4d, SE_ResNeXt152_32x4d from .dpn import DPN68, DPN92, DPN98, DPN107, DPN131 from .shufflenet_v2 import ShuffleNetV2_x0_5, ShuffleNetV2_x1_0, ShuffleNetV2_x1_5, ShuffleNetV2_x2_0 +from .fast_imagenet import FastImageNet diff --git a/fluid/PaddleCV/image_classification/models_name/alexnet.py b/fluid/PaddleCV/image_classification/legacy/models/alexnet.py similarity index 83% rename from fluid/PaddleCV/image_classification/models_name/alexnet.py rename to fluid/PaddleCV/image_classification/legacy/models/alexnet.py index f063c4d6deb88905aaa5f8a5eba59903f58293e8..abe3b92965b1c16312e4ddf68809f6a4c93183fa 100644 --- a/fluid/PaddleCV/image_classification/models_name/alexnet.py +++ b/fluid/PaddleCV/image_classification/legacy/models/alexnet.py @@ -26,9 +26,6 @@ class AlexNet(): def net(self, input, class_dim=1000): stdv = 1.0 / math.sqrt(input.shape[1] * 11 * 11) - layer_name = [ - "conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8" - ] conv1 = fluid.layers.conv2d( input=input, num_filters=64, @@ -38,11 +35,9 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[0] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[0] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) pool1 = fluid.layers.pool2d( input=conv1, pool_size=3, @@ -60,11 +55,9 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[1] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - 
initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[1] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) pool2 = fluid.layers.pool2d( input=conv2, pool_size=3, @@ -82,11 +75,9 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[2] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[2] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) stdv = 1.0 / math.sqrt(conv3.shape[1] * 3 * 3) conv4 = fluid.layers.conv2d( @@ -98,11 +89,9 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[3] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[3] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) stdv = 1.0 / math.sqrt(conv4.shape[1] * 3 * 3) conv5 = fluid.layers.conv2d( @@ -114,11 +103,9 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[4] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[4] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) pool5 = fluid.layers.pool2d( input=conv5, pool_size=3, @@ -127,42 +114,36 @@ class AlexNet(): pool_type='max') drop6 = fluid.layers.dropout(x=pool5, dropout_prob=0.5) + stdv = 1.0 / math.sqrt(drop6.shape[1] * drop6.shape[2] * drop6.shape[3] * 1.0) - fc6 = fluid.layers.fc( input=drop6, size=4096, act='relu', bias_attr=fluid.param_attr.ParamAttr( - 
initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[5] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[5] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) drop7 = fluid.layers.dropout(x=fc6, dropout_prob=0.5) - stdv = 1.0 / math.sqrt(drop7.shape[1] * 1.0) + stdv = 1.0 / math.sqrt(drop7.shape[1] * 1.0) fc7 = fluid.layers.fc( input=drop7, size=4096, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[6] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[6] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) stdv = 1.0 / math.sqrt(fc7.shape[1] * 1.0) out = fluid.layers.fc( input=fc7, size=class_dim, bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[7] + "_offset"), + initializer=fluid.initializer.Uniform(-stdv, stdv)), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=layer_name[7] + "_weights")) + initializer=fluid.initializer.Uniform(-stdv, stdv))) return out diff --git a/fluid/PaddleCV/image_classification/models_name/dpn.py b/fluid/PaddleCV/image_classification/legacy/models/dpn.py similarity index 69% rename from fluid/PaddleCV/image_classification/models_name/dpn.py rename to fluid/PaddleCV/image_classification/legacy/models/dpn.py index 7f759b3bb6bfa9c866e129ac93ab2c6a9cf4168c..316e96ac2cebd2dec4a60bf1748ed321fa651590 100644 --- a/fluid/PaddleCV/image_classification/models_name/dpn.py +++ b/fluid/PaddleCV/image_classification/legacy/models/dpn.py @@ -7,7 +7,6 @@ import time import sys import paddle.fluid as fluid import math -from paddle.fluid.param_attr import ParamAttr 
__all__ = ["DPN", "DPN68", "DPN92", "DPN98", "DPN107", "DPN131"] @@ -53,30 +52,16 @@ class DPN(object): padding=init_padding, groups=1, act=None, - bias_attr=False, - name="conv1", - param_attr=ParamAttr(name="conv1_weights"), ) - + bias_attr=False) conv1_x_1 = fluid.layers.batch_norm( - input=conv1_x_1, - act='relu', - is_test=False, - name="conv1_bn", - param_attr=ParamAttr(name='conv1_bn_scale'), - bias_attr=ParamAttr('conv1_bn_offset'), - moving_mean_name='conv1_bn_mean', - moving_variance_name='conv1_bn_variance', ) - + input=conv1_x_1, act='relu', is_test=False) convX_x_x = fluid.layers.pool2d( input=conv1_x_1, pool_size=3, pool_stride=2, pool_padding=1, - pool_type='max', - name="pool1") + pool_type='max') - #conv2 - conv5 - match_list, num = [], 0 for gc in range(4): bw = bws[gc] inc = inc_sec[gc] @@ -84,46 +69,32 @@ class DPN(object): if gc == 0: _type1 = 'proj' _type2 = 'normal' - match = 1 else: _type1 = 'down' _type2 = 'normal' - match = match + k_sec[gc - 1] - match_list.append(match) - - convX_x_x = self.dual_path_factory( - convX_x_x, R, R, bw, inc, G, _type1, name="dpn" + str(match)) + convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G, + _type1) for i_ly in range(2, k_sec[gc] + 1): - num += 1 - if num in match_list: - num += 1 - convX_x_x = self.dual_path_factory( - convX_x_x, R, R, bw, inc, G, _type2, name="dpn" + str(num)) + convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G, + _type2) conv5_x_x = fluid.layers.concat(convX_x_x, axis=1) conv5_x_x = fluid.layers.batch_norm( - input=conv5_x_x, - act='relu', - is_test=False, - name="final_concat_bn", - param_attr=ParamAttr(name='final_concat_bn_scale'), - bias_attr=ParamAttr('final_concat_bn_offset'), - moving_mean_name='final_concat_bn_mean', - moving_variance_name='final_concat_bn_variance', ) + input=conv5_x_x, act='relu', is_test=False) pool5 = fluid.layers.pool2d( input=conv5_x_x, pool_size=7, pool_stride=1, pool_padding=0, - pool_type='avg', ) + pool_type='avg') + #stdv 
= 1.0 / math.sqrt(pool5.shape[1] * 1.0) stdv = 0.01 param_attr = fluid.param_attr.ParamAttr( initializer=fluid.initializer.Uniform(-stdv, stdv)) fc6 = fluid.layers.fc(input=pool5, size=class_dim, - param_attr=param_attr, - name="fc6") + param_attr=param_attr) return fc6 @@ -201,8 +172,7 @@ class DPN(object): num_1x1_c, inc, G, - _type='normal', - name=None): + _type='normal'): kw = 3 kh = 3 pw = (kw - 1) // 2 @@ -231,50 +201,35 @@ class DPN(object): num_filter=(num_1x1_c + 2 * inc), kernel=(1, 1), pad=(0, 0), - stride=(key_stride, key_stride), - name=name + "_match") + stride=(key_stride, key_stride)) data_o1, data_o2 = fluid.layers.split( - c1x1_w, - num_or_sections=[num_1x1_c, 2 * inc], - dim=1, - name=name + "_match_conv_Slice") + c1x1_w, num_or_sections=[num_1x1_c, 2 * inc], dim=1) else: data_o1 = data[0] data_o2 = data[1] # MAIN c1x1_a = self.bn_ac_conv( - data=data_in, - num_filter=num_1x1_a, - kernel=(1, 1), - pad=(0, 0), - name=name + "_conv1") + data=data_in, num_filter=num_1x1_a, kernel=(1, 1), pad=(0, 0)) c3x3_b = self.bn_ac_conv( data=c1x1_a, num_filter=num_3x3_b, kernel=(kw, kh), pad=(pw, ph), stride=(key_stride, key_stride), - num_group=G, - name=name + "_conv2") + num_group=G) c1x1_c = self.bn_ac_conv( data=c3x3_b, num_filter=(num_1x1_c + inc), kernel=(1, 1), - pad=(0, 0), - name=name + "_conv3") + pad=(0, 0)) c1x1_c1, c1x1_c2 = fluid.layers.split( - c1x1_c, - num_or_sections=[num_1x1_c, inc], - dim=1, - name=name + "_conv3_Slice") + c1x1_c, num_or_sections=[num_1x1_c, inc], dim=1) # OUTPUTS - summ = fluid.layers.elementwise_add( - x=data_o1, y=c1x1_c1, name=name + "_elewise") - dense = fluid.layers.concat( - [data_o2, c1x1_c2], axis=1, name=name + "_concat") + summ = fluid.layers.elementwise_add(x=data_o1, y=c1x1_c1) + dense = fluid.layers.concat([data_o2, c1x1_c2], axis=1) return [summ, dense] @@ -284,17 +239,8 @@ class DPN(object): kernel, pad, stride=(1, 1), - num_group=1, - name=None): - bn_ac = fluid.layers.batch_norm( - input=data, - 
act='relu', - is_test=False, - name=name + '.output.1', - param_attr=ParamAttr(name=name + '_bn_scale'), - bias_attr=ParamAttr(name + '_bn_offset'), - moving_mean_name=name + '_bn_mean', - moving_variance_name=name + '_bn_variance', ) + num_group=1): + bn_ac = fluid.layers.batch_norm(input=data, act='relu', is_test=False) bn_ac_conv = fluid.layers.conv2d( input=bn_ac, num_filters=num_filter, @@ -303,8 +249,7 @@ class DPN(object): padding=pad, groups=num_group, act=None, - bias_attr=False, - param_attr=ParamAttr(name=name + "_weights")) + bias_attr=False) return bn_ac_conv @@ -314,7 +259,7 @@ def DPN68(): def DPN92(): - onvodel = DPN(layers=92) + model = DPN(layers=92) return model diff --git a/fluid/PaddleCV/image_classification/legacy/models/googlenet.py b/fluid/PaddleCV/image_classification/legacy/models/googlenet.py new file mode 100644 index 0000000000000000000000000000000000000000..be52ed96fcb801cc4a7d69d61470dd5732ff044c --- /dev/null +++ b/fluid/PaddleCV/image_classification/legacy/models/googlenet.py @@ -0,0 +1,167 @@ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +import paddle +import paddle.fluid as fluid + +__all__ = ['GoogleNet'] + +train_parameters = { + "input_size": [3, 224, 224], + "input_mean": [0.485, 0.456, 0.406], + "input_std": [0.229, 0.224, 0.225], + "learning_strategy": { + "name": "piecewise_decay", + "batch_size": 256, + "epochs": [30, 70, 100], + "steps": [0.1, 0.01, 0.001, 0.0001] + } +} + + +class GoogleNet(): + def __init__(self): + self.params = train_parameters + + def conv_layer(self, + input, + num_filters, + filter_size, + stride=1, + groups=1, + act=None): + channels = input.shape[1] + stdv = (3.0 / (filter_size**2 * channels))**0.5 + param_attr = fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv)) + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size 
- 1) // 2, + groups=groups, + act=act, + param_attr=param_attr, + bias_attr=False) + return conv + + def xavier(self, channels, filter_size): + stdv = (3.0 / (filter_size**2 * channels))**0.5 + param_attr = fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv)) + return param_attr + + def inception(self, name, input, channels, filter1, filter3R, filter3, + filter5R, filter5, proj): + conv1 = self.conv_layer( + input=input, num_filters=filter1, filter_size=1, stride=1, act=None) + conv3r = self.conv_layer( + input=input, + num_filters=filter3R, + filter_size=1, + stride=1, + act=None) + conv3 = self.conv_layer( + input=conv3r, + num_filters=filter3, + filter_size=3, + stride=1, + act=None) + conv5r = self.conv_layer( + input=input, + num_filters=filter5R, + filter_size=1, + stride=1, + act=None) + conv5 = self.conv_layer( + input=conv5r, + num_filters=filter5, + filter_size=5, + stride=1, + act=None) + pool = fluid.layers.pool2d( + input=input, + pool_size=3, + pool_stride=1, + pool_padding=1, + pool_type='max') + convprj = fluid.layers.conv2d( + input=pool, filter_size=1, num_filters=proj, stride=1, padding=0) + cat = fluid.layers.concat(input=[conv1, conv3, conv5, convprj], axis=1) + cat = fluid.layers.relu(cat) + return cat + + def net(self, input, class_dim=1000): + conv = self.conv_layer( + input=input, num_filters=64, filter_size=7, stride=2, act=None) + pool = fluid.layers.pool2d( + input=conv, pool_size=3, pool_type='max', pool_stride=2) + + conv = self.conv_layer( + input=pool, num_filters=64, filter_size=1, stride=1, act=None) + conv = self.conv_layer( + input=conv, num_filters=192, filter_size=3, stride=1, act=None) + pool = fluid.layers.pool2d( + input=conv, pool_size=3, pool_type='max', pool_stride=2) + + ince3a = self.inception("ince3a", pool, 192, 64, 96, 128, 16, 32, 32) + ince3b = self.inception("ince3b", ince3a, 256, 128, 128, 192, 32, 96, + 64) + pool3 = fluid.layers.pool2d( + input=ince3b, pool_size=3, pool_type='max', 
pool_stride=2) + + ince4a = self.inception("ince4a", pool3, 480, 192, 96, 208, 16, 48, 64) + ince4b = self.inception("ince4b", ince4a, 512, 160, 112, 224, 24, 64, + 64) + ince4c = self.inception("ince4c", ince4b, 512, 128, 128, 256, 24, 64, + 64) + ince4d = self.inception("ince4d", ince4c, 512, 112, 144, 288, 32, 64, + 64) + ince4e = self.inception("ince4e", ince4d, 528, 256, 160, 320, 32, 128, + 128) + pool4 = fluid.layers.pool2d( + input=ince4e, pool_size=3, pool_type='max', pool_stride=2) + + ince5a = self.inception("ince5a", pool4, 832, 256, 160, 320, 32, 128, + 128) + ince5b = self.inception("ince5b", ince5a, 832, 384, 192, 384, 48, 128, + 128) + pool5 = fluid.layers.pool2d( + input=ince5b, pool_size=7, pool_type='avg', pool_stride=7) + dropout = fluid.layers.dropout(x=pool5, dropout_prob=0.4) + out = fluid.layers.fc(input=dropout, + size=class_dim, + act='softmax', + param_attr=self.xavier(1024, 1)) + + pool_o1 = fluid.layers.pool2d( + input=ince4a, pool_size=5, pool_type='avg', pool_stride=3) + conv_o1 = self.conv_layer( + input=pool_o1, num_filters=128, filter_size=1, stride=1, act=None) + fc_o1 = fluid.layers.fc(input=conv_o1, + size=1024, + act='relu', + param_attr=self.xavier(2048, 1)) + dropout_o1 = fluid.layers.dropout(x=fc_o1, dropout_prob=0.7) + out1 = fluid.layers.fc(input=dropout_o1, + size=class_dim, + act='softmax', + param_attr=self.xavier(1024, 1)) + + pool_o2 = fluid.layers.pool2d( + input=ince4d, pool_size=5, pool_type='avg', pool_stride=3) + conv_o2 = self.conv_layer( + input=pool_o2, num_filters=128, filter_size=1, stride=1, act=None) + fc_o2 = fluid.layers.fc(input=conv_o2, + size=1024, + act='relu', + param_attr=self.xavier(2048, 1)) + dropout_o2 = fluid.layers.dropout(x=fc_o2, dropout_prob=0.7) + out2 = fluid.layers.fc(input=dropout_o2, + size=class_dim, + act='softmax', + param_attr=self.xavier(1024, 1)) + + # last fc layer is "out" + return out, out1, out2 diff --git a/fluid/PaddleCV/image_classification/legacy/models/inception_v4.py 
b/fluid/PaddleCV/image_classification/legacy/models/inception_v4.py new file mode 100644 index 0000000000000000000000000000000000000000..1520375477ade6e61f0a5584278b13e40ab541eb --- /dev/null +++ b/fluid/PaddleCV/image_classification/legacy/models/inception_v4.py @@ -0,0 +1,206 @@ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +import paddle +import paddle.fluid as fluid +import math + +__all__ = ['InceptionV4'] + +train_parameters = { + "input_size": [3, 224, 224], + "input_mean": [0.485, 0.456, 0.406], + "input_std": [0.229, 0.224, 0.225], + "learning_strategy": { + "name": "piecewise_decay", + "batch_size": 256, + "epochs": [30, 60, 90], + "steps": [0.1, 0.01, 0.001, 0.0001] + } +} + + +class InceptionV4(): + def __init__(self): + self.params = train_parameters + + def net(self, input, class_dim=1000): + x = self.inception_stem(input) + + for i in range(4): + x = self.inceptionA(x) + x = self.reductionA(x) + + for i in range(7): + x = self.inceptionB(x) + x = self.reductionB(x) + + for i in range(3): + x = self.inceptionC(x) + + pool = fluid.layers.pool2d( + input=x, pool_size=8, pool_type='avg', global_pooling=True) + + drop = fluid.layers.dropout(x=pool, dropout_prob=0.2) + + stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0) + out = fluid.layers.fc( + input=drop, + size=class_dim, + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv))) + return out + + def conv_bn_layer(self, + data, + num_filters, + filter_size, + stride=1, + padding=0, + groups=1, + act='relu'): + conv = fluid.layers.conv2d( + input=data, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=padding, + groups=groups, + act=None, + bias_attr=False) + return fluid.layers.batch_norm(input=conv, act=act) + + def inception_stem(self, data): + conv = self.conv_bn_layer(data, 32, 3, stride=2, act='relu') + conv = self.conv_bn_layer(conv, 32, 3, act='relu') + conv = 
self.conv_bn_layer(conv, 64, 3, padding=1, act='relu') + + pool1 = fluid.layers.pool2d( + input=conv, pool_size=3, pool_stride=2, pool_type='max') + conv2 = self.conv_bn_layer(conv, 96, 3, stride=2, act='relu') + concat = fluid.layers.concat([pool1, conv2], axis=1) + + conv1 = self.conv_bn_layer(concat, 64, 1, act='relu') + conv1 = self.conv_bn_layer(conv1, 96, 3, act='relu') + + conv2 = self.conv_bn_layer(concat, 64, 1, act='relu') + conv2 = self.conv_bn_layer( + conv2, 64, (7, 1), padding=(3, 0), act='relu') + conv2 = self.conv_bn_layer( + conv2, 64, (1, 7), padding=(0, 3), act='relu') + conv2 = self.conv_bn_layer(conv2, 96, 3, act='relu') + + concat = fluid.layers.concat([conv1, conv2], axis=1) + + conv1 = self.conv_bn_layer(concat, 192, 3, stride=2, act='relu') + pool1 = fluid.layers.pool2d( + input=concat, pool_size=3, pool_stride=2, pool_type='max') + + concat = fluid.layers.concat([conv1, pool1], axis=1) + + return concat + + def inceptionA(self, data): + pool1 = fluid.layers.pool2d( + input=data, pool_size=3, pool_padding=1, pool_type='avg') + conv1 = self.conv_bn_layer(pool1, 96, 1, act='relu') + + conv2 = self.conv_bn_layer(data, 96, 1, act='relu') + + conv3 = self.conv_bn_layer(data, 64, 1, act='relu') + conv3 = self.conv_bn_layer(conv3, 96, 3, padding=1, act='relu') + + conv4 = self.conv_bn_layer(data, 64, 1, act='relu') + conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu') + conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu') + + concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) + + return concat + + def reductionA(self, data): + pool1 = fluid.layers.pool2d( + input=data, pool_size=3, pool_stride=2, pool_type='max') + + conv2 = self.conv_bn_layer(data, 384, 3, stride=2, act='relu') + + conv3 = self.conv_bn_layer(data, 192, 1, act='relu') + conv3 = self.conv_bn_layer(conv3, 224, 3, padding=1, act='relu') + conv3 = self.conv_bn_layer(conv3, 256, 3, stride=2, act='relu') + + concat = fluid.layers.concat([pool1, 
conv2, conv3], axis=1) + + return concat + + def inceptionB(self, data): + pool1 = fluid.layers.pool2d( + input=data, pool_size=3, pool_padding=1, pool_type='avg') + conv1 = self.conv_bn_layer(pool1, 128, 1, act='relu') + + conv2 = self.conv_bn_layer(data, 384, 1, act='relu') + + conv3 = self.conv_bn_layer(data, 192, 1, act='relu') + conv3 = self.conv_bn_layer( + conv3, 224, (1, 7), padding=(0, 3), act='relu') + conv3 = self.conv_bn_layer( + conv3, 256, (7, 1), padding=(3, 0), act='relu') + + conv4 = self.conv_bn_layer(data, 192, 1, act='relu') + conv4 = self.conv_bn_layer( + conv4, 192, (1, 7), padding=(0, 3), act='relu') + conv4 = self.conv_bn_layer( + conv4, 224, (7, 1), padding=(3, 0), act='relu') + conv4 = self.conv_bn_layer( + conv4, 224, (1, 7), padding=(0, 3), act='relu') + conv4 = self.conv_bn_layer( + conv4, 256, (7, 1), padding=(3, 0), act='relu') + + concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) + + return concat + + def reductionB(self, data): + pool1 = fluid.layers.pool2d( + input=data, pool_size=3, pool_stride=2, pool_type='max') + + conv2 = self.conv_bn_layer(data, 192, 1, act='relu') + conv2 = self.conv_bn_layer(conv2, 192, 3, stride=2, act='relu') + + conv3 = self.conv_bn_layer(data, 256, 1, act='relu') + conv3 = self.conv_bn_layer( + conv3, 256, (1, 7), padding=(0, 3), act='relu') + conv3 = self.conv_bn_layer( + conv3, 320, (7, 1), padding=(3, 0), act='relu') + conv3 = self.conv_bn_layer(conv3, 320, 3, stride=2, act='relu') + + concat = fluid.layers.concat([pool1, conv2, conv3], axis=1) + + return concat + + def inceptionC(self, data): + pool1 = fluid.layers.pool2d( + input=data, pool_size=3, pool_padding=1, pool_type='avg') + conv1 = self.conv_bn_layer(pool1, 256, 1, act='relu') + + conv2 = self.conv_bn_layer(data, 256, 1, act='relu') + + conv3 = self.conv_bn_layer(data, 384, 1, act='relu') + conv3_1 = self.conv_bn_layer( + conv3, 256, (1, 3), padding=(0, 1), act='relu') + conv3_2 = self.conv_bn_layer( + conv3, 256, (3, 1), 
padding=(1, 0), act='relu') + + conv4 = self.conv_bn_layer(data, 384, 1, act='relu') + conv4 = self.conv_bn_layer( + conv4, 448, (1, 3), padding=(0, 1), act='relu') + conv4 = self.conv_bn_layer( + conv4, 512, (3, 1), padding=(1, 0), act='relu') + conv4_1 = self.conv_bn_layer( + conv4, 256, (1, 3), padding=(0, 1), act='relu') + conv4_2 = self.conv_bn_layer( + conv4, 256, (3, 1), padding=(1, 0), act='relu') + + concat = fluid.layers.concat( + [conv1, conv2, conv3_1, conv3_2, conv4_1, conv4_2], axis=1) + + return concat diff --git a/fluid/PaddleCV/image_classification/models_name/mobilenet.py b/fluid/PaddleCV/image_classification/legacy/models/mobilenet.py similarity index 70% rename from fluid/PaddleCV/image_classification/models_name/mobilenet.py rename to fluid/PaddleCV/image_classification/legacy/models/mobilenet.py index d242bc946a7b4bec9c9d2e34da2496c0901ba870..d0b419e8b4083104ba529c9f886284aa724953e6 100644 --- a/fluid/PaddleCV/image_classification/models_name/mobilenet.py +++ b/fluid/PaddleCV/image_classification/legacy/models/mobilenet.py @@ -32,8 +32,7 @@ class MobileNet(): channels=3, num_filters=int(32 * scale), stride=2, - padding=1, - name="conv1") + padding=1) # 56x56 input = self.depthwise_separable( @@ -42,8 +41,7 @@ class MobileNet(): num_filters2=64, num_groups=32, stride=1, - scale=scale, - name="conv2_1") + scale=scale) input = self.depthwise_separable( input, @@ -51,8 +49,7 @@ class MobileNet(): num_filters2=128, num_groups=64, stride=2, - scale=scale, - name="conv2_2") + scale=scale) # 28x28 input = self.depthwise_separable( @@ -61,8 +58,7 @@ class MobileNet(): num_filters2=128, num_groups=128, stride=1, - scale=scale, - name="conv3_1") + scale=scale) input = self.depthwise_separable( input, @@ -70,8 +66,7 @@ class MobileNet(): num_filters2=256, num_groups=128, stride=2, - scale=scale, - name="conv3_2") + scale=scale) # 14x14 input = self.depthwise_separable( @@ -80,8 +75,7 @@ class MobileNet(): num_filters2=256, num_groups=256, stride=1, - 
scale=scale, - name="conv4_1") + scale=scale) input = self.depthwise_separable( input, @@ -89,8 +83,7 @@ class MobileNet(): num_filters2=512, num_groups=256, stride=2, - scale=scale, - name="conv4_2") + scale=scale) # 14x14 for i in range(5): @@ -100,8 +93,7 @@ class MobileNet(): num_filters2=512, num_groups=512, stride=1, - scale=scale, - name="conv5" + "_" + str(i + 1)) + scale=scale) # 7x7 input = self.depthwise_separable( input, @@ -109,8 +101,7 @@ class MobileNet(): num_filters2=1024, num_groups=512, stride=2, - scale=scale, - name="conv5_6") + scale=scale) input = self.depthwise_separable( input, @@ -118,8 +109,7 @@ class MobileNet(): num_filters2=1024, num_groups=1024, stride=1, - scale=scale, - name="conv6") + scale=scale) input = fluid.layers.pool2d( input=input, @@ -130,9 +120,7 @@ class MobileNet(): output = fluid.layers.fc(input=input, size=class_dim, - param_attr=ParamAttr( - initializer=MSRA(), name="fc7_weights"), - bias_attr=ParamAttr(name="fc7_offset")) + param_attr=ParamAttr(initializer=MSRA())) return output def conv_bn_layer(self, @@ -144,8 +132,7 @@ class MobileNet(): channels=None, num_groups=1, act='relu', - use_cudnn=True, - name=None): + use_cudnn=True): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -155,26 +142,12 @@ class MobileNet(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr( - initializer=MSRA(), name=name + "_weights"), + param_attr=ParamAttr(initializer=MSRA()), bias_attr=False) - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - param_attr=ParamAttr(name=bn_name + "_scale"), - bias_attr=ParamAttr(name=bn_name + "_offset"), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') - - def depthwise_separable(self, - input, - num_filters1, - num_filters2, - num_groups, - stride, - scale, - name=None): + return fluid.layers.batch_norm(input=conv, act=act) + + def depthwise_separable(self, input, num_filters1, num_filters2, 
num_groups, + stride, scale): depthwise_conv = self.conv_bn_layer( input=input, filter_size=3, @@ -182,14 +155,12 @@ class MobileNet(): stride=stride, padding=1, num_groups=int(num_groups * scale), - use_cudnn=False, - name=name + "_dw") + use_cudnn=False) pointwise_conv = self.conv_bn_layer( input=depthwise_conv, filter_size=1, num_filters=int(num_filters2 * scale), stride=1, - padding=0, - name=name + "_sep") + padding=0) return pointwise_conv diff --git a/fluid/PaddleCV/image_classification/models_name/mobilenet_v2.py b/fluid/PaddleCV/image_classification/legacy/models/mobilenet_v2.py similarity index 71% rename from fluid/PaddleCV/image_classification/models_name/mobilenet_v2.py rename to fluid/PaddleCV/image_classification/legacy/models/mobilenet_v2.py index 77d88c7da625c0c953c75d229148868f0481f2a2..c219b1bf5a7260fbb07627bc3fa039f4b2833092 100644 --- a/fluid/PaddleCV/image_classification/models_name/mobilenet_v2.py +++ b/fluid/PaddleCV/image_classification/legacy/models/mobilenet_v2.py @@ -36,40 +36,33 @@ class MobileNetV2(): (6, 320, 1, 1), ] - #conv1 input = self.conv_bn_layer( input, num_filters=int(32 * scale), filter_size=3, stride=2, padding=1, - if_act=True, - name='conv1_1') + if_act=True) - # bottleneck sequences - i = 1 in_c = int(32 * scale) for layer_setting in bottleneck_params_list: t, c, n, s = layer_setting - i += 1 input = self.invresi_blocks( input=input, in_c=in_c, t=t, c=int(c * scale), n=n, - s=s, - name='conv' + str(i)) + s=s, ) in_c = int(c * scale) - #last_conv + input = self.conv_bn_layer( input=input, num_filters=int(1280 * scale) if scale > 1.0 else 1280, filter_size=1, stride=1, padding=0, - if_act=True, - name='conv9') + if_act=True) input = fluid.layers.pool2d( input=input, @@ -80,8 +73,7 @@ class MobileNetV2(): output = fluid.layers.fc(input=input, size=class_dim, - param_attr=ParamAttr(name='fc10_weights'), - bias_attr=ParamAttr(name='fc10_offset')) + param_attr=ParamAttr(initializer=MSRA())) return output def 
conv_bn_layer(self, @@ -92,9 +84,8 @@ class MobileNetV2(): padding, channels=None, num_groups=1, - if_act=True, - name=None, - use_cudnn=True): + use_cudnn=True, + if_act=True): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -104,15 +95,9 @@ class MobileNetV2(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr(name=name + '_weights'), + param_attr=ParamAttr(initializer=MSRA()), bias_attr=False) - bn_name = name + '_bn' - bn = fluid.layers.batch_norm( - input=conv, - param_attr=ParamAttr(name=bn_name + "_scale"), - bias_attr=ParamAttr(name=bn_name + "_offset"), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') + bn = fluid.layers.batch_norm(input=conv) if if_act: return fluid.layers.relu6(bn) else: @@ -121,18 +106,10 @@ class MobileNetV2(): def shortcut(self, input, data_residual): return fluid.layers.elementwise_add(input, data_residual) - def inverted_residual_unit(self, - input, - num_in_filter, - num_filters, - ifshortcut, - stride, - filter_size, - padding, - expansion_factor, - name=None): + def inverted_residual_unit(self, input, num_in_filter, num_filters, + ifshortcut, stride, filter_size, padding, + expansion_factor): num_expfilter = int(round(num_in_filter * expansion_factor)) - channel_expand = self.conv_bn_layer( input=input, num_filters=num_expfilter, @@ -140,9 +117,7 @@ class MobileNetV2(): stride=1, padding=0, num_groups=1, - if_act=True, - name=name + '_expand') - + if_act=True) bottleneck_conv = self.conv_bn_layer( input=channel_expand, num_filters=num_expfilter, @@ -151,9 +126,7 @@ class MobileNetV2(): padding=padding, num_groups=num_expfilter, if_act=True, - name=name + '_dwise', use_cudnn=False) - linear_out = self.conv_bn_layer( input=bottleneck_conv, num_filters=num_filters, @@ -161,15 +134,14 @@ class MobileNetV2(): stride=1, padding=0, num_groups=1, - if_act=False, - name=name + '_linear') + if_act=False) if ifshortcut: out = self.shortcut(input=input, 
data_residual=linear_out) return out else: return linear_out - def invresi_blocks(self, input, in_c, t, c, n, s, name=None): + def invresi_blocks(self, input, in_c, t, c, n, s): first_block = self.inverted_residual_unit( input=input, num_in_filter=in_c, @@ -178,8 +150,7 @@ class MobileNetV2(): stride=s, filter_size=3, padding=1, - expansion_factor=t, - name=name + '_1') + expansion_factor=t) last_residual_block = first_block last_c = c @@ -193,6 +164,5 @@ class MobileNetV2(): stride=1, filter_size=3, padding=1, - expansion_factor=t, - name=name + '_' + str(i + 1)) + expansion_factor=t) return last_residual_block diff --git a/fluid/PaddleCV/image_classification/models_name/resnet.py b/fluid/PaddleCV/image_classification/legacy/models/resnet.py similarity index 59% rename from fluid/PaddleCV/image_classification/models_name/resnet.py rename to fluid/PaddleCV/image_classification/legacy/models/resnet.py index 19fa4ff2c4c30d0fa11b592c21f3db5e51663159..def99db6d84673b77582cf93374f4cb2f00e9ac5 100644 --- a/fluid/PaddleCV/image_classification/models_name/resnet.py +++ b/fluid/PaddleCV/image_classification/legacy/models/resnet.py @@ -4,7 +4,6 @@ from __future__ import print_function import paddle import paddle.fluid as fluid import math -from paddle.fluid.param_attr import ParamAttr __all__ = ["ResNet", "ResNet50", "ResNet101", "ResNet152"] @@ -41,12 +40,7 @@ class ResNet(): num_filters = [64, 128, 256, 512] conv = self.conv_bn_layer( - input=input, - num_filters=64, - filter_size=7, - stride=2, - act='relu', - name="conv1") + input=input, num_filters=64, filter_size=7, stride=2, act='relu') conv = fluid.layers.pool2d( input=conv, pool_size=3, @@ -56,18 +50,10 @@ class ResNet(): for block in range(len(depth)): for i in range(depth[block]): - if layers in [101, 152] and block == 2: - if i == 0: - conv_name = "res" + str(block + 2) + "a" - else: - conv_name = "res" + str(block + 2) + "b" + str(i) - else: - conv_name = "res" + str(block + 2) + chr(97 + i) conv = 
self.bottleneck_block( input=conv, num_filters=num_filters[block], - stride=2 if i == 0 and block != 0 else 1, - name=conv_name) + stride=2 if i == 0 and block != 0 else 1) pool = fluid.layers.pool2d( input=conv, pool_size=7, pool_type='avg', global_pooling=True) @@ -85,8 +71,7 @@ class ResNet(): filter_size, stride=1, groups=1, - act=None, - name=None): + act=None): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -95,55 +80,31 @@ class ResNet(): padding=(filter_size - 1) // 2, groups=groups, act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name + '.conv2d.output.1') - if name == "conv1": - bn_name = "bn_" + name - else: - bn_name = "bn" + name[3:] - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '.output.1', - param_attr=ParamAttr(name=bn_name + '_scale'), - bias_attr=ParamAttr(bn_name + '_offset'), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance', ) - - def shortcut(self, input, ch_out, stride, name): + bias_attr=False) + return fluid.layers.batch_norm(input=conv, act=act) + + def shortcut(self, input, ch_out, stride): ch_in = input.shape[1] if ch_in != ch_out or stride != 1: - return self.conv_bn_layer(input, ch_out, 1, stride, name=name) + return self.conv_bn_layer(input, ch_out, 1, stride) else: return input - def bottleneck_block(self, input, num_filters, stride, name): + def bottleneck_block(self, input, num_filters, stride): conv0 = self.conv_bn_layer( - input=input, - num_filters=num_filters, - filter_size=1, - act='relu', - name=name + "_branch2a") + input=input, num_filters=num_filters, filter_size=1, act='relu') conv1 = self.conv_bn_layer( input=conv0, num_filters=num_filters, filter_size=3, stride=stride, - act='relu', - name=name + "_branch2b") + act='relu') conv2 = self.conv_bn_layer( - input=conv1, - num_filters=num_filters * 4, - filter_size=1, - act=None, - name=name + "_branch2c") + input=conv1, num_filters=num_filters * 4, 
filter_size=1, act=None) - short = self.shortcut( - input, num_filters * 4, stride, name=name + "_branch1") + short = self.shortcut(input, num_filters * 4, stride) - return fluid.layers.elementwise_add( - x=short, y=conv2, act='relu', name=name + ".add.output.5") + return fluid.layers.elementwise_add(x=short, y=conv2, act='relu') def ResNet50(): diff --git a/fluid/PaddleCV/image_classification/models_name/se_resnext.py b/fluid/PaddleCV/image_classification/legacy/models/se_resnext.py similarity index 59% rename from fluid/PaddleCV/image_classification/models_name/se_resnext.py rename to fluid/PaddleCV/image_classification/legacy/models/se_resnext.py index 0ae3d66fddbe2d1b9da5e2f52fe80d15931d256d..ac50bd87b5070000a018949e777a897427c3e5a5 100644 --- a/fluid/PaddleCV/image_classification/models_name/se_resnext.py +++ b/fluid/PaddleCV/image_classification/legacy/models/se_resnext.py @@ -4,7 +4,6 @@ from __future__ import print_function import paddle import paddle.fluid as fluid import math -from paddle.fluid.param_attr import ParamAttr __all__ = [ "SE_ResNeXt", "SE_ResNeXt50_32x4d", "SE_ResNeXt101_32x4d", @@ -19,7 +18,7 @@ train_parameters = { "learning_strategy": { "name": "piecewise_decay", "batch_size": 256, - "epochs": [40, 80, 100], + "epochs": [30, 60, 90], "steps": [0.1, 0.01, 0.001, 0.0001] } } @@ -46,8 +45,7 @@ class SE_ResNeXt(): num_filters=64, filter_size=7, stride=2, - act='relu', - name='conv1', ) + act='relu') conv = fluid.layers.pool2d( input=conv, pool_size=3, @@ -65,8 +63,7 @@ class SE_ResNeXt(): num_filters=64, filter_size=7, stride=2, - act='relu', - name="conv1", ) + act='relu') conv = fluid.layers.pool2d( input=conv, pool_size=3, @@ -84,94 +81,67 @@ class SE_ResNeXt(): num_filters=64, filter_size=3, stride=2, - act='relu', - name='conv1') + act='relu') conv = self.conv_bn_layer( - input=conv, - num_filters=64, - filter_size=3, - stride=1, - act='relu', - name='conv2') + input=conv, num_filters=64, filter_size=3, stride=1, act='relu') conv = 
self.conv_bn_layer( input=conv, num_filters=128, filter_size=3, stride=1, - act='relu', - name='conv3') + act='relu') conv = fluid.layers.pool2d( input=conv, pool_size=3, pool_stride=2, pool_padding=1, \ pool_type='max') - n = 1 if layers == 50 or layers == 101 else 3 + for block in range(len(depth)): - n += 1 for i in range(depth[block]): conv = self.bottleneck_block( input=conv, num_filters=num_filters[block], stride=2 if i == 0 and block != 0 else 1, cardinality=cardinality, - reduction_ratio=reduction_ratio, - name=str(n) + '_' + str(i + 1)) + reduction_ratio=reduction_ratio) pool = fluid.layers.pool2d( input=conv, pool_size=7, pool_type='avg', global_pooling=True) drop = fluid.layers.dropout( x=pool, dropout_prob=0.5, seed=self.params['dropout_seed']) stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0) - out = fluid.layers.fc( - input=drop, - size=class_dim, - param_attr=ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name='fc6_weights'), - bias_attr=ParamAttr(name='fc6_offset')) + out = fluid.layers.fc(input=drop, + size=class_dim, + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, + stdv))) return out - def shortcut(self, input, ch_out, stride, name): + def shortcut(self, input, ch_out, stride): ch_in = input.shape[1] if ch_in != ch_out or stride != 1: filter_size = 1 - return self.conv_bn_layer( - input, ch_out, filter_size, stride, name='conv' + name + '_prj') + return self.conv_bn_layer(input, ch_out, filter_size, stride) else: return input - def bottleneck_block(self, - input, - num_filters, - stride, - cardinality, - reduction_ratio, - name=None): + def bottleneck_block(self, input, num_filters, stride, cardinality, + reduction_ratio): conv0 = self.conv_bn_layer( - input=input, - num_filters=num_filters, - filter_size=1, - act='relu', - name='conv' + name + '_x1') + input=input, num_filters=num_filters, filter_size=1, act='relu') conv1 = self.conv_bn_layer( input=conv0, num_filters=num_filters, 
filter_size=3, stride=stride, groups=cardinality, - act='relu', - name='conv' + name + '_x2') + act='relu') conv2 = self.conv_bn_layer( - input=conv1, - num_filters=num_filters * 2, - filter_size=1, - act=None, - name='conv' + name + '_x3') + input=conv1, num_filters=num_filters * 2, filter_size=1, act=None) scale = self.squeeze_excitation( input=conv2, num_channels=num_filters * 2, - reduction_ratio=reduction_ratio, - name='fc' + name) + reduction_ratio=reduction_ratio) - short = self.shortcut(input, num_filters * 2, stride, name=name) + short = self.shortcut(input, num_filters * 2, stride) return fluid.layers.elementwise_add(x=short, y=scale, act='relu') @@ -181,8 +151,7 @@ class SE_ResNeXt(): filter_size, stride=1, groups=1, - act=None, - name=None): + act=None): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -191,42 +160,26 @@ class SE_ResNeXt(): padding=(filter_size - 1) // 2, groups=groups, act=None, - bias_attr=False, - param_attr=ParamAttr(name=name + '_weights'), ) - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - param_attr=ParamAttr(name=bn_name + '_scale'), - bias_attr=ParamAttr(bn_name + '_offset'), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') - - def squeeze_excitation(self, - input, - num_channels, - reduction_ratio, - name=None): + bias_attr=False) + return fluid.layers.batch_norm(input=conv, act=act) + + def squeeze_excitation(self, input, num_channels, reduction_ratio): pool = fluid.layers.pool2d( input=input, pool_size=0, pool_type='avg', global_pooling=True) stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) - squeeze = fluid.layers.fc( - input=pool, - size=num_channels // reduction_ratio, - act='relu', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=name + '_sqz_weights'), - bias_attr=ParamAttr(name=name + '_sqz_offset')) + squeeze = fluid.layers.fc(input=pool, + size=num_channels // reduction_ratio, + 
act='relu', + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform( + -stdv, stdv))) stdv = 1.0 / math.sqrt(squeeze.shape[1] * 1.0) - excitation = fluid.layers.fc( - input=squeeze, - size=num_channels, - act='sigmoid', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=name + '_exc_weights'), - bias_attr=ParamAttr(name=name + '_exc_offset')) + excitation = fluid.layers.fc(input=squeeze, + size=num_channels, + act='sigmoid', + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform( + -stdv, stdv))) scale = fluid.layers.elementwise_mul(x=input, y=excitation, axis=0) return scale diff --git a/fluid/PaddleCV/image_classification/models_name/shufflenet_v2.py b/fluid/PaddleCV/image_classification/legacy/models/shufflenet_v2.py similarity index 71% rename from fluid/PaddleCV/image_classification/models_name/shufflenet_v2.py rename to fluid/PaddleCV/image_classification/legacy/models/shufflenet_v2.py index 595debf2199be9609100cb686aad65eb9cb55416..6db88aa769dd6b3b3e2987fcac6d8054319a2a56 100644 --- a/fluid/PaddleCV/image_classification/models_name/shufflenet_v2.py +++ b/fluid/PaddleCV/image_classification/legacy/models/shufflenet_v2.py @@ -52,8 +52,7 @@ class ShuffleNetV2(): filter_size=3, num_filters=input_channel, padding=1, - stride=2, - name='stage1_conv') + stride=2) pool1 = fluid.layers.pool2d( input=conv1, pool_size=3, @@ -71,35 +70,30 @@ class ShuffleNetV2(): input=conv, num_filters=output_channel, stride=2, - benchmodel=2, - name=str(idxstage + 2) + '_' + str(i + 1)) + benchmodel=2) else: conv = self.inverted_residual_unit( input=conv, num_filters=output_channel, stride=1, - benchmodel=1, - name=str(idxstage + 2) + '_' + str(i + 1)) + benchmodel=1) conv_last = self.conv_bn_layer( input=conv, filter_size=1, num_filters=stage_out_channels[-1], padding=0, - stride=1, - name='conv5') + stride=1) pool_last = fluid.layers.pool2d( input=conv_last, pool_size=7, - 
pool_stride=1, + pool_stride=7, pool_padding=0, pool_type='avg') output = fluid.layers.fc(input=pool_last, size=class_dim, - param_attr=ParamAttr( - initializer=MSRA(), name='fc6_weights'), - bias_attr=ParamAttr(name='fc6_offset')) + param_attr=ParamAttr(initializer=MSRA())) return output def conv_bn_layer(self, @@ -110,9 +104,7 @@ class ShuffleNetV2(): padding, num_groups=1, use_cudnn=True, - if_act=True, - name=None): - # print(num_groups) + if_act=True): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -122,25 +114,12 @@ class ShuffleNetV2(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr( - initializer=MSRA(), name=name + '_weights'), + param_attr=ParamAttr(initializer=MSRA()), bias_attr=False) - bn_name = name + '_bn' if if_act: - return fluid.layers.batch_norm( - input=conv, - act='relu', - param_attr=ParamAttr(name=bn_name + "_scale"), - bias_attr=ParamAttr(name=bn_name + "_offset"), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') + return fluid.layers.batch_norm(input=conv, act='relu') else: - return fluid.layers.batch_norm( - input=conv, - param_attr=ParamAttr(name=bn_name + "_scale"), - bias_attr=ParamAttr(name=bn_name + "_offset"), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') + return fluid.layers.batch_norm(input=conv) def channel_shuffle(self, x, groups): batchsize, num_channels, height, width = x.shape[0], x.shape[ @@ -159,12 +138,7 @@ class ShuffleNetV2(): return x - def inverted_residual_unit(self, - input, - num_filters, - stride, - benchmodel, - name=None): + def inverted_residual_unit(self, input, num_filters, stride, benchmodel): assert stride in [1, 2], \ "supported stride are {} but your stride is {}".format([1,2], stride) @@ -176,8 +150,6 @@ class ShuffleNetV2(): input, num_or_sections=[input.shape[1] // 2, input.shape[1] // 2], dim=1) - # x1 = input[:, :(input.shape[1]//2), :, :] - # x2 = input[:, (input.shape[1]//2):, :, 
:] conv_pw = self.conv_bn_layer( input=x2, @@ -186,8 +158,7 @@ class ShuffleNetV2(): stride=1, padding=0, num_groups=1, - if_act=True, - name='stage_' + name + '_conv1') + if_act=True) conv_dw = self.conv_bn_layer( input=conv_pw, @@ -196,8 +167,7 @@ class ShuffleNetV2(): stride=stride, padding=1, num_groups=oup_inc, - if_act=False, - name='stage_' + name + '_conv2') + if_act=False) conv_linear = self.conv_bn_layer( input=conv_dw, @@ -206,63 +176,57 @@ class ShuffleNetV2(): stride=1, padding=0, num_groups=1, - if_act=True, - name='stage_' + name + '_conv3') + if_act=True) out = fluid.layers.concat([x1, conv_linear], axis=1) else: #branch1 - conv_dw_1 = self.conv_bn_layer( + conv_dw = self.conv_bn_layer( input=input, num_filters=inp, filter_size=3, stride=stride, padding=1, num_groups=inp, - if_act=False, - name='stage_' + name + '_conv4') + if_act=False) conv_linear_1 = self.conv_bn_layer( - input=conv_dw_1, + input=conv_dw, num_filters=oup_inc, filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True, - name='stage_' + name + '_conv5') + if_act=True) #branch2 - conv_pw_2 = self.conv_bn_layer( + conv_pw = self.conv_bn_layer( input=input, num_filters=oup_inc, filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True, - name='stage_' + name + '_conv1') + if_act=True) - conv_dw_2 = self.conv_bn_layer( - input=conv_pw_2, + conv_dw = self.conv_bn_layer( + input=conv_pw, num_filters=oup_inc, filter_size=3, stride=stride, padding=1, num_groups=oup_inc, - if_act=False, - name='stage_' + name + '_conv2') + if_act=False) conv_linear_2 = self.conv_bn_layer( - input=conv_dw_2, + input=conv_dw, num_filters=oup_inc, filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True, - name='stage_' + name + '_conv3') + if_act=True) out = fluid.layers.concat([conv_linear_1, conv_linear_2], axis=1) return self.channel_shuffle(out, 2) diff --git a/fluid/PaddleCV/image_classification/models_name/vgg.py b/fluid/PaddleCV/image_classification/legacy/models/vgg.py similarity 
index 65% rename from fluid/PaddleCV/image_classification/models_name/vgg.py rename to fluid/PaddleCV/image_classification/legacy/models/vgg.py index 8fcd2d9f1c397a428685cfb7bd264f18c0d0a7e7..7f559982334575c7c2bc778e1be8a4ebf69549fc 100644 --- a/fluid/PaddleCV/image_classification/models_name/vgg.py +++ b/fluid/PaddleCV/image_classification/legacy/models/vgg.py @@ -36,37 +36,42 @@ class VGGNet(): "supported layers are {} but input layer is {}".format(vgg_spec.keys(), layers) nums = vgg_spec[layers] - conv1 = self.conv_block(input, 64, nums[0], name="conv1_") - conv2 = self.conv_block(conv1, 128, nums[1], name="conv2_") - conv3 = self.conv_block(conv2, 256, nums[2], name="conv3_") - conv4 = self.conv_block(conv3, 512, nums[3], name="conv4_") - conv5 = self.conv_block(conv4, 512, nums[4], name="conv5_") + conv1 = self.conv_block(input, 64, nums[0]) + conv2 = self.conv_block(conv1, 128, nums[1]) + conv3 = self.conv_block(conv2, 256, nums[2]) + conv4 = self.conv_block(conv3, 512, nums[3]) + conv5 = self.conv_block(conv4, 512, nums[4]) fc_dim = 4096 - fc_name = ["fc6", "fc7", "fc8"] fc1 = fluid.layers.fc( input=conv5, size=fc_dim, act='relu', - param_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_weights"), - bias_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_offset")) + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Normal(scale=0.005)), + bias_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Constant(value=0.1))) fc1 = fluid.layers.dropout(x=fc1, dropout_prob=0.5) fc2 = fluid.layers.fc( input=fc1, size=fc_dim, act='relu', - param_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_weights"), - bias_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_offset")) + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Normal(scale=0.005)), + bias_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Constant(value=0.1))) fc2 = fluid.layers.dropout(x=fc2, dropout_prob=0.5) out = fluid.layers.fc( 
input=fc2, size=class_dim, - param_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_weights"), - bias_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_offset")) + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Normal(scale=0.005)), + bias_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Constant(value=0.1))) return out - def conv_block(self, input, num_filter, groups, name=None): + def conv_block(self, input, num_filter, groups): conv = input for i in range(groups): conv = fluid.layers.conv2d( @@ -77,9 +82,9 @@ class VGGNet(): padding=1, act='relu', param_attr=fluid.param_attr.ParamAttr( - name=name + str(i + 1) + "_weights"), + initializer=fluid.initializer.Normal(scale=0.01)), bias_attr=fluid.param_attr.ParamAttr( - name=name + str(i + 1) + "_offset")) + initializer=fluid.initializer.Constant(value=0.0))) return fluid.layers.pool2d( input=conv, pool_size=2, pool_type='max', pool_stride=2) diff --git a/fluid/PaddleCV/image_classification/models/__init__.py b/fluid/PaddleCV/image_classification/models/__init__.py index 9659029482ed6f51c9a63ac330ad9d8fd3b8b98f..458991ca732a22f3774568ffbfa84514ddadfe5c 100644 --- a/fluid/PaddleCV/image_classification/models/__init__.py +++ b/fluid/PaddleCV/image_classification/models/__init__.py @@ -3,10 +3,10 @@ from .mobilenet import MobileNet from .mobilenet_v2 import MobileNetV2 from .googlenet import GoogleNet from .vgg import VGG11, VGG13, VGG16, VGG19 -from .resnet import ResNet50, ResNet101, ResNet152 +from .resnet import ResNet18, ResNet34, ResNet50, ResNet101, ResNet152 from .resnet_dist import DistResNet from .inception_v4 import InceptionV4 from .se_resnext import SE_ResNeXt50_32x4d, SE_ResNeXt101_32x4d, SE_ResNeXt152_32x4d from .dpn import DPN68, DPN92, DPN98, DPN107, DPN131 -from .shufflenet_v2 import ShuffleNetV2_x0_5, ShuffleNetV2_x1_0, ShuffleNetV2_x1_5, ShuffleNetV2_x2_0 -from .fast_imagenet import FastImageNet +from .shufflenet_v2 import ShuffleNetV2, 
ShuffleNetV2_x0_5_swish, ShuffleNetV2_x1_0_swish, ShuffleNetV2_x1_5_swish, ShuffleNetV2_x2_0_swish, ShuffleNetV2_x8_0_swish +from .fast_imagenet import FastImageNet diff --git a/fluid/PaddleCV/image_classification/models/alexnet.py b/fluid/PaddleCV/image_classification/models/alexnet.py index abe3b92965b1c16312e4ddf68809f6a4c93183fa..f063c4d6deb88905aaa5f8a5eba59903f58293e8 100644 --- a/fluid/PaddleCV/image_classification/models/alexnet.py +++ b/fluid/PaddleCV/image_classification/models/alexnet.py @@ -26,6 +26,9 @@ class AlexNet(): def net(self, input, class_dim=1000): stdv = 1.0 / math.sqrt(input.shape[1] * 11 * 11) + layer_name = [ + "conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8" + ] conv1 = fluid.layers.conv2d( input=input, num_filters=64, @@ -35,9 +38,11 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[0] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[0] + "_weights")) pool1 = fluid.layers.pool2d( input=conv1, pool_size=3, @@ -55,9 +60,11 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[1] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[1] + "_weights")) pool2 = fluid.layers.pool2d( input=conv2, pool_size=3, @@ -75,9 +82,11 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[2] + "_offset"), 
param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[2] + "_weights")) stdv = 1.0 / math.sqrt(conv3.shape[1] * 3 * 3) conv4 = fluid.layers.conv2d( @@ -89,9 +98,11 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[3] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[3] + "_weights")) stdv = 1.0 / math.sqrt(conv4.shape[1] * 3 * 3) conv5 = fluid.layers.conv2d( @@ -103,9 +114,11 @@ class AlexNet(): groups=1, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[4] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[4] + "_weights")) pool5 = fluid.layers.pool2d( input=conv5, pool_size=3, @@ -114,36 +127,42 @@ class AlexNet(): pool_type='max') drop6 = fluid.layers.dropout(x=pool5, dropout_prob=0.5) - stdv = 1.0 / math.sqrt(drop6.shape[1] * drop6.shape[2] * drop6.shape[3] * 1.0) + fc6 = fluid.layers.fc( input=drop6, size=4096, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[5] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[5] + "_weights")) drop7 = fluid.layers.dropout(x=fc6, dropout_prob=0.5) - stdv = 1.0 / math.sqrt(drop7.shape[1] * 1.0) + fc7 = fluid.layers.fc( input=drop7, 
size=4096, act='relu', bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[6] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[6] + "_weights")) stdv = 1.0 / math.sqrt(fc7.shape[1] * 1.0) out = fluid.layers.fc( input=fc7, size=class_dim, bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)), + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[7] + "_offset"), param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=layer_name[7] + "_weights")) return out diff --git a/fluid/PaddleCV/image_classification/models/dpn.py b/fluid/PaddleCV/image_classification/models/dpn.py index 316e96ac2cebd2dec4a60bf1748ed321fa651590..7f759b3bb6bfa9c866e129ac93ab2c6a9cf4168c 100644 --- a/fluid/PaddleCV/image_classification/models/dpn.py +++ b/fluid/PaddleCV/image_classification/models/dpn.py @@ -7,6 +7,7 @@ import time import sys import paddle.fluid as fluid import math +from paddle.fluid.param_attr import ParamAttr __all__ = ["DPN", "DPN68", "DPN92", "DPN98", "DPN107", "DPN131"] @@ -52,16 +53,30 @@ class DPN(object): padding=init_padding, groups=1, act=None, - bias_attr=False) + bias_attr=False, + name="conv1", + param_attr=ParamAttr(name="conv1_weights"), ) + conv1_x_1 = fluid.layers.batch_norm( - input=conv1_x_1, act='relu', is_test=False) + input=conv1_x_1, + act='relu', + is_test=False, + name="conv1_bn", + param_attr=ParamAttr(name='conv1_bn_scale'), + bias_attr=ParamAttr('conv1_bn_offset'), + moving_mean_name='conv1_bn_mean', + moving_variance_name='conv1_bn_variance', ) + convX_x_x = fluid.layers.pool2d( input=conv1_x_1, pool_size=3, pool_stride=2, pool_padding=1, - pool_type='max') + 
pool_type='max', + name="pool1") + #conv2 - conv5 + match_list, num = [], 0 for gc in range(4): bw = bws[gc] inc = inc_sec[gc] @@ -69,32 +84,46 @@ class DPN(object): if gc == 0: _type1 = 'proj' _type2 = 'normal' + match = 1 else: _type1 = 'down' _type2 = 'normal' - convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G, - _type1) + match = match + k_sec[gc - 1] + match_list.append(match) + + convX_x_x = self.dual_path_factory( + convX_x_x, R, R, bw, inc, G, _type1, name="dpn" + str(match)) for i_ly in range(2, k_sec[gc] + 1): - convX_x_x = self.dual_path_factory(convX_x_x, R, R, bw, inc, G, - _type2) + num += 1 + if num in match_list: + num += 1 + convX_x_x = self.dual_path_factory( + convX_x_x, R, R, bw, inc, G, _type2, name="dpn" + str(num)) conv5_x_x = fluid.layers.concat(convX_x_x, axis=1) conv5_x_x = fluid.layers.batch_norm( - input=conv5_x_x, act='relu', is_test=False) + input=conv5_x_x, + act='relu', + is_test=False, + name="final_concat_bn", + param_attr=ParamAttr(name='final_concat_bn_scale'), + bias_attr=ParamAttr('final_concat_bn_offset'), + moving_mean_name='final_concat_bn_mean', + moving_variance_name='final_concat_bn_variance', ) pool5 = fluid.layers.pool2d( input=conv5_x_x, pool_size=7, pool_stride=1, pool_padding=0, - pool_type='avg') + pool_type='avg', ) - #stdv = 1.0 / math.sqrt(pool5.shape[1] * 1.0) stdv = 0.01 param_attr = fluid.param_attr.ParamAttr( initializer=fluid.initializer.Uniform(-stdv, stdv)) fc6 = fluid.layers.fc(input=pool5, size=class_dim, - param_attr=param_attr) + param_attr=param_attr, + name="fc6") return fc6 @@ -172,7 +201,8 @@ class DPN(object): num_1x1_c, inc, G, - _type='normal'): + _type='normal', + name=None): kw = 3 kh = 3 pw = (kw - 1) // 2 @@ -201,35 +231,50 @@ class DPN(object): num_filter=(num_1x1_c + 2 * inc), kernel=(1, 1), pad=(0, 0), - stride=(key_stride, key_stride)) + stride=(key_stride, key_stride), + name=name + "_match") data_o1, data_o2 = fluid.layers.split( - c1x1_w, num_or_sections=[num_1x1_c, 2 * 
inc], dim=1) + c1x1_w, + num_or_sections=[num_1x1_c, 2 * inc], + dim=1, + name=name + "_match_conv_Slice") else: data_o1 = data[0] data_o2 = data[1] # MAIN c1x1_a = self.bn_ac_conv( - data=data_in, num_filter=num_1x1_a, kernel=(1, 1), pad=(0, 0)) + data=data_in, + num_filter=num_1x1_a, + kernel=(1, 1), + pad=(0, 0), + name=name + "_conv1") c3x3_b = self.bn_ac_conv( data=c1x1_a, num_filter=num_3x3_b, kernel=(kw, kh), pad=(pw, ph), stride=(key_stride, key_stride), - num_group=G) + num_group=G, + name=name + "_conv2") c1x1_c = self.bn_ac_conv( data=c3x3_b, num_filter=(num_1x1_c + inc), kernel=(1, 1), - pad=(0, 0)) + pad=(0, 0), + name=name + "_conv3") c1x1_c1, c1x1_c2 = fluid.layers.split( - c1x1_c, num_or_sections=[num_1x1_c, inc], dim=1) + c1x1_c, + num_or_sections=[num_1x1_c, inc], + dim=1, + name=name + "_conv3_Slice") # OUTPUTS - summ = fluid.layers.elementwise_add(x=data_o1, y=c1x1_c1) - dense = fluid.layers.concat([data_o2, c1x1_c2], axis=1) + summ = fluid.layers.elementwise_add( + x=data_o1, y=c1x1_c1, name=name + "_elewise") + dense = fluid.layers.concat( + [data_o2, c1x1_c2], axis=1, name=name + "_concat") return [summ, dense] @@ -239,8 +284,17 @@ class DPN(object): kernel, pad, stride=(1, 1), - num_group=1): - bn_ac = fluid.layers.batch_norm(input=data, act='relu', is_test=False) + num_group=1, + name=None): + bn_ac = fluid.layers.batch_norm( + input=data, + act='relu', + is_test=False, + name=name + '.output.1', + param_attr=ParamAttr(name=name + '_bn_scale'), + bias_attr=ParamAttr(name + '_bn_offset'), + moving_mean_name=name + '_bn_mean', + moving_variance_name=name + '_bn_variance', ) bn_ac_conv = fluid.layers.conv2d( input=bn_ac, num_filters=num_filter, @@ -249,7 +303,8 @@ class DPN(object): padding=pad, groups=num_group, act=None, - bias_attr=False) + bias_attr=False, + param_attr=ParamAttr(name=name + "_weights")) return bn_ac_conv @@ -259,7 +314,7 @@ def DPN68(): def DPN92(): - model = DPN(layers=92) + model = DPN(layers=92) return model diff
--git a/fluid/PaddleCV/image_classification/models/googlenet.py b/fluid/PaddleCV/image_classification/models/googlenet.py index be52ed96fcb801cc4a7d69d61470dd5732ff044c..bd9040c53e61a48d9f5bff6683bec961d3f95583 100644 --- a/fluid/PaddleCV/image_classification/models/googlenet.py +++ b/fluid/PaddleCV/image_classification/models/googlenet.py @@ -3,6 +3,7 @@ from __future__ import division from __future__ import print_function import paddle import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr __all__ = ['GoogleNet'] @@ -29,11 +30,13 @@ class GoogleNet(): filter_size, stride=1, groups=1, - act=None): + act=None, + name=None): channels = input.shape[1] stdv = (3.0 / (filter_size**2 * channels))**0.5 - param_attr = fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)) + param_attr = ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=name + "_weights") conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -43,43 +46,63 @@ class GoogleNet(): groups=groups, act=act, param_attr=param_attr, - bias_attr=False) + bias_attr=False, + name=name) return conv - def xavier(self, channels, filter_size): + def xavier(self, channels, filter_size, name): stdv = (3.0 / (filter_size**2 * channels))**0.5 - param_attr = fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv)) + param_attr = ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=name + "_weights") + return param_attr - def inception(self, name, input, channels, filter1, filter3R, filter3, - filter5R, filter5, proj): + def inception(self, + input, + channels, + filter1, + filter3R, + filter3, + filter5R, + filter5, + proj, + name=None): conv1 = self.conv_layer( - input=input, num_filters=filter1, filter_size=1, stride=1, act=None) + input=input, + num_filters=filter1, + filter_size=1, + stride=1, + act=None, + name="inception_" + name + "_1x1") conv3r = self.conv_layer( input=input, 
num_filters=filter3R, filter_size=1, stride=1, - act=None) + act=None, + name="inception_" + name + "_3x3_reduce") conv3 = self.conv_layer( input=conv3r, num_filters=filter3, filter_size=3, stride=1, - act=None) + act=None, + name="inception_" + name + "_3x3") conv5r = self.conv_layer( input=input, num_filters=filter5R, filter_size=1, stride=1, - act=None) + act=None, + name="inception_" + name + "_5x5_reduce") conv5 = self.conv_layer( input=conv5r, num_filters=filter5, filter_size=5, stride=1, - act=None) + act=None, + name="inception_" + name + "_5x5") pool = fluid.layers.pool2d( input=input, pool_size=3, @@ -87,81 +110,124 @@ class GoogleNet(): pool_padding=1, pool_type='max') convprj = fluid.layers.conv2d( - input=pool, filter_size=1, num_filters=proj, stride=1, padding=0) + input=pool, + filter_size=1, + num_filters=proj, + stride=1, + padding=0, + name="inception_" + name + "_3x3_proj", + param_attr=ParamAttr( + name="inception_" + name + "_3x3_proj_weights"), + bias_attr=False) cat = fluid.layers.concat(input=[conv1, conv3, conv5, convprj], axis=1) cat = fluid.layers.relu(cat) return cat def net(self, input, class_dim=1000): conv = self.conv_layer( - input=input, num_filters=64, filter_size=7, stride=2, act=None) + input=input, + num_filters=64, + filter_size=7, + stride=2, + act=None, + name="conv1") pool = fluid.layers.pool2d( input=conv, pool_size=3, pool_type='max', pool_stride=2) conv = self.conv_layer( - input=pool, num_filters=64, filter_size=1, stride=1, act=None) + input=pool, + num_filters=64, + filter_size=1, + stride=1, + act=None, + name="conv2_1x1") conv = self.conv_layer( - input=conv, num_filters=192, filter_size=3, stride=1, act=None) + input=conv, + num_filters=192, + filter_size=3, + stride=1, + act=None, + name="conv2_3x3") pool = fluid.layers.pool2d( input=conv, pool_size=3, pool_type='max', pool_stride=2) - ince3a = self.inception("ince3a", pool, 192, 64, 96, 128, 16, 32, 32) - ince3b = self.inception("ince3b", ince3a, 256, 128, 128, 
192, 32, 96, - 64) + ince3a = self.inception(pool, 192, 64, 96, 128, 16, 32, 32, "ince3a") + ince3b = self.inception(ince3a, 256, 128, 128, 192, 32, 96, 64, + "ince3b") pool3 = fluid.layers.pool2d( input=ince3b, pool_size=3, pool_type='max', pool_stride=2) - ince4a = self.inception("ince4a", pool3, 480, 192, 96, 208, 16, 48, 64) - ince4b = self.inception("ince4b", ince4a, 512, 160, 112, 224, 24, 64, - 64) - ince4c = self.inception("ince4c", ince4b, 512, 128, 128, 256, 24, 64, - 64) - ince4d = self.inception("ince4d", ince4c, 512, 112, 144, 288, 32, 64, - 64) - ince4e = self.inception("ince4e", ince4d, 528, 256, 160, 320, 32, 128, - 128) + ince4a = self.inception(pool3, 480, 192, 96, 208, 16, 48, 64, "ince4a") + ince4b = self.inception(ince4a, 512, 160, 112, 224, 24, 64, 64, + "ince4b") + ince4c = self.inception(ince4b, 512, 128, 128, 256, 24, 64, 64, + "ince4c") + ince4d = self.inception(ince4c, 512, 112, 144, 288, 32, 64, 64, + "ince4d") + ince4e = self.inception(ince4d, 528, 256, 160, 320, 32, 128, 128, + "ince4e") pool4 = fluid.layers.pool2d( input=ince4e, pool_size=3, pool_type='max', pool_stride=2) - ince5a = self.inception("ince5a", pool4, 832, 256, 160, 320, 32, 128, - 128) - ince5b = self.inception("ince5b", ince5a, 832, 384, 192, 384, 48, 128, - 128) + ince5a = self.inception(pool4, 832, 256, 160, 320, 32, 128, 128, + "ince5a") + ince5b = self.inception(ince5a, 832, 384, 192, 384, 48, 128, 128, + "ince5b") pool5 = fluid.layers.pool2d( input=ince5b, pool_size=7, pool_type='avg', pool_stride=7) dropout = fluid.layers.dropout(x=pool5, dropout_prob=0.4) out = fluid.layers.fc(input=dropout, size=class_dim, act='softmax', - param_attr=self.xavier(1024, 1)) + param_attr=self.xavier(1024, 1, "out"), + name="out", + bias_attr=ParamAttr(name="out_offset")) pool_o1 = fluid.layers.pool2d( input=ince4a, pool_size=5, pool_type='avg', pool_stride=3) conv_o1 = self.conv_layer( - input=pool_o1, num_filters=128, filter_size=1, stride=1, act=None) + input=pool_o1, + 
num_filters=128, + filter_size=1, + stride=1, + act=None, + name="conv_o1") fc_o1 = fluid.layers.fc(input=conv_o1, size=1024, act='relu', - param_attr=self.xavier(2048, 1)) + param_attr=self.xavier(2048, 1, "fc_o1"), + name="fc_o1", + bias_attr=ParamAttr(name="fc_o1_offset")) dropout_o1 = fluid.layers.dropout(x=fc_o1, dropout_prob=0.7) out1 = fluid.layers.fc(input=dropout_o1, size=class_dim, act='softmax', - param_attr=self.xavier(1024, 1)) + param_attr=self.xavier(1024, 1, "out1"), + name="out1", + bias_attr=ParamAttr(name="out1_offset")) pool_o2 = fluid.layers.pool2d( input=ince4d, pool_size=5, pool_type='avg', pool_stride=3) conv_o2 = self.conv_layer( - input=pool_o2, num_filters=128, filter_size=1, stride=1, act=None) + input=pool_o2, + num_filters=128, + filter_size=1, + stride=1, + act=None, + name="conv_o2") fc_o2 = fluid.layers.fc(input=conv_o2, size=1024, act='relu', - param_attr=self.xavier(2048, 1)) + param_attr=self.xavier(2048, 1, "fc_o2"), + name="fc_o2", + bias_attr=ParamAttr(name="fc_o2_offset")) dropout_o2 = fluid.layers.dropout(x=fc_o2, dropout_prob=0.7) out2 = fluid.layers.fc(input=dropout_o2, size=class_dim, act='softmax', - param_attr=self.xavier(1024, 1)) + param_attr=self.xavier(1024, 1, "out2"), + name="out2", + bias_attr=ParamAttr(name="out2_offset")) # last fc layer is "out" return out, out1, out2 diff --git a/fluid/PaddleCV/image_classification/models/inception_v4.py b/fluid/PaddleCV/image_classification/models/inception_v4.py index 1520375477ade6e61f0a5584278b13e40ab541eb..8c6c0dbb129f903b4f0b849f930a520b5f17e5db 100644 --- a/fluid/PaddleCV/image_classification/models/inception_v4.py +++ b/fluid/PaddleCV/image_classification/models/inception_v4.py @@ -4,6 +4,7 @@ from __future__ import print_function import paddle import paddle.fluid as fluid import math +from paddle.fluid.param_attr import ParamAttr __all__ = ['InceptionV4'] @@ -28,15 +29,15 @@ class InceptionV4(): x = self.inception_stem(input) for i in range(4): - x = 
self.inceptionA(x) + x = self.inceptionA(x, name=str(i + 1)) x = self.reductionA(x) for i in range(7): - x = self.inceptionB(x) + x = self.inceptionB(x, name=str(i + 1)) x = self.reductionB(x) for i in range(3): - x = self.inceptionC(x) + x = self.inceptionC(x, name=str(i + 1)) pool = fluid.layers.pool2d( input=x, pool_size=8, pool_type='avg', global_pooling=True) @@ -47,8 +48,12 @@ class InceptionV4(): out = fluid.layers.fc( input=drop, size=class_dim, - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv))) + param_attr=ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name="final_fc_weights"), + bias_attr=ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name="final_fc_offset")) return out def conv_bn_layer(self, @@ -58,7 +63,8 @@ class InceptionV4(): stride=1, padding=0, groups=1, - act='relu'): + act='relu', + name=None): conv = fluid.layers.conv2d( input=data, num_filters=num_filters, @@ -67,32 +73,58 @@ class InceptionV4(): padding=padding, groups=groups, act=None, - bias_attr=False) - return fluid.layers.batch_norm(input=conv, act=act) - - def inception_stem(self, data): - conv = self.conv_bn_layer(data, 32, 3, stride=2, act='relu') - conv = self.conv_bn_layer(conv, 32, 3, act='relu') - conv = self.conv_bn_layer(conv, 64, 3, padding=1, act='relu') + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name) + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + name=bn_name, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(name=bn_name + "_offset"), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance') + + def inception_stem(self, data, name=None): + conv = self.conv_bn_layer( + data, 32, 3, stride=2, act='relu', name="conv1_3x3_s2") + conv = self.conv_bn_layer(conv, 32, 3, act='relu', name="conv2_3x3_s1") + conv = self.conv_bn_layer( + conv, 64, 3, padding=1, act='relu', 
name="conv3_3x3_s1") pool1 = fluid.layers.pool2d( input=conv, pool_size=3, pool_stride=2, pool_type='max') - conv2 = self.conv_bn_layer(conv, 96, 3, stride=2, act='relu') + conv2 = self.conv_bn_layer( + conv, 96, 3, stride=2, act='relu', name="inception_stem1_3x3_s2") concat = fluid.layers.concat([pool1, conv2], axis=1) - conv1 = self.conv_bn_layer(concat, 64, 1, act='relu') - conv1 = self.conv_bn_layer(conv1, 96, 3, act='relu') + conv1 = self.conv_bn_layer( + concat, 64, 1, act='relu', name="inception_stem2_3x3_reduce") + conv1 = self.conv_bn_layer( + conv1, 96, 3, act='relu', name="inception_stem2_3x3") - conv2 = self.conv_bn_layer(concat, 64, 1, act='relu') conv2 = self.conv_bn_layer( - conv2, 64, (7, 1), padding=(3, 0), act='relu') + concat, 64, 1, act='relu', name="inception_stem2_1x7_reduce") + conv2 = self.conv_bn_layer( + conv2, + 64, (7, 1), + padding=(3, 0), + act='relu', + name="inception_stem2_1x7") conv2 = self.conv_bn_layer( - conv2, 64, (1, 7), padding=(0, 3), act='relu') - conv2 = self.conv_bn_layer(conv2, 96, 3, act='relu') + conv2, + 64, (1, 7), + padding=(0, 3), + act='relu', + name="inception_stem2_7x1") + conv2 = self.conv_bn_layer( + conv2, 96, 3, act='relu', name="inception_stem2_3x3_2") concat = fluid.layers.concat([conv1, conv2], axis=1) - conv1 = self.conv_bn_layer(concat, 192, 3, stride=2, act='relu') + conv1 = self.conv_bn_layer( + concat, 192, 3, stride=2, act='relu', name="inception_stem3_3x3_s2") pool1 = fluid.layers.pool2d( input=concat, pool_size=3, pool_stride=2, pool_type='max') @@ -100,105 +132,207 @@ class InceptionV4(): return concat - def inceptionA(self, data): + def inceptionA(self, data, name=None): pool1 = fluid.layers.pool2d( input=data, pool_size=3, pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer(pool1, 96, 1, act='relu') + conv1 = self.conv_bn_layer( + pool1, 96, 1, act='relu', name="inception_a" + name + "_1x1") - conv2 = self.conv_bn_layer(data, 96, 1, act='relu') + conv2 = self.conv_bn_layer( + data, 
96, 1, act='relu', name="inception_a" + name + "_1x1_2") - conv3 = self.conv_bn_layer(data, 64, 1, act='relu') - conv3 = self.conv_bn_layer(conv3, 96, 3, padding=1, act='relu') + conv3 = self.conv_bn_layer( + data, 64, 1, act='relu', name="inception_a" + name + "_3x3_reduce") + conv3 = self.conv_bn_layer( + conv3, + 96, + 3, + padding=1, + act='relu', + name="inception_a" + name + "_3x3") - conv4 = self.conv_bn_layer(data, 64, 1, act='relu') - conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu') - conv4 = self.conv_bn_layer(conv4, 96, 3, padding=1, act='relu') + conv4 = self.conv_bn_layer( + data, + 64, + 1, + act='relu', + name="inception_a" + name + "_3x3_2_reduce") + conv4 = self.conv_bn_layer( + conv4, + 96, + 3, + padding=1, + act='relu', + name="inception_a" + name + "_3x3_2") + conv4 = self.conv_bn_layer( + conv4, + 96, + 3, + padding=1, + act='relu', + name="inception_a" + name + "_3x3_3") concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) return concat - def reductionA(self, data): + def reductionA(self, data, name=None): pool1 = fluid.layers.pool2d( input=data, pool_size=3, pool_stride=2, pool_type='max') - conv2 = self.conv_bn_layer(data, 384, 3, stride=2, act='relu') + conv2 = self.conv_bn_layer( + data, 384, 3, stride=2, act='relu', name="reduction_a_3x3") - conv3 = self.conv_bn_layer(data, 192, 1, act='relu') - conv3 = self.conv_bn_layer(conv3, 224, 3, padding=1, act='relu') - conv3 = self.conv_bn_layer(conv3, 256, 3, stride=2, act='relu') + conv3 = self.conv_bn_layer( + data, 192, 1, act='relu', name="reduction_a_3x3_2_reduce") + conv3 = self.conv_bn_layer( + conv3, 224, 3, padding=1, act='relu', name="reduction_a_3x3_2") + conv3 = self.conv_bn_layer( + conv3, 256, 3, stride=2, act='relu', name="reduction_a_3x3_3") concat = fluid.layers.concat([pool1, conv2, conv3], axis=1) return concat - def inceptionB(self, data): + def inceptionB(self, data, name=None): pool1 = fluid.layers.pool2d( input=data, pool_size=3, 
pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer(pool1, 128, 1, act='relu') + conv1 = self.conv_bn_layer( + pool1, 128, 1, act='relu', name="inception_b" + name + "_1x1") - conv2 = self.conv_bn_layer(data, 384, 1, act='relu') + conv2 = self.conv_bn_layer( + data, 384, 1, act='relu', name="inception_b" + name + "_1x1_2") - conv3 = self.conv_bn_layer(data, 192, 1, act='relu') conv3 = self.conv_bn_layer( - conv3, 224, (1, 7), padding=(0, 3), act='relu') + data, 192, 1, act='relu', name="inception_b" + name + "_1x7_reduce") conv3 = self.conv_bn_layer( - conv3, 256, (7, 1), padding=(3, 0), act='relu') + conv3, + 224, (1, 7), + padding=(0, 3), + act='relu', + name="inception_b" + name + "_1x7") + conv3 = self.conv_bn_layer( + conv3, + 256, (7, 1), + padding=(3, 0), + act='relu', + name="inception_b" + name + "_7x1") - conv4 = self.conv_bn_layer(data, 192, 1, act='relu') conv4 = self.conv_bn_layer( - conv4, 192, (1, 7), padding=(0, 3), act='relu') + data, + 192, + 1, + act='relu', + name="inception_b" + name + "_7x1_2_reduce") + conv4 = self.conv_bn_layer( + conv4, + 192, (1, 7), + padding=(0, 3), + act='relu', + name="inception_b" + name + "_1x7_2") conv4 = self.conv_bn_layer( - conv4, 224, (7, 1), padding=(3, 0), act='relu') + conv4, + 224, (7, 1), + padding=(3, 0), + act='relu', + name="inception_b" + name + "_7x1_2") conv4 = self.conv_bn_layer( - conv4, 224, (1, 7), padding=(0, 3), act='relu') + conv4, + 224, (1, 7), + padding=(0, 3), + act='relu', + name="inception_b" + name + "_1x7_3") conv4 = self.conv_bn_layer( - conv4, 256, (7, 1), padding=(3, 0), act='relu') + conv4, + 256, (7, 1), + padding=(3, 0), + act='relu', + name="inception_b" + name + "_7x1_3") concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) return concat - def reductionB(self, data): + def reductionB(self, data, name=None): pool1 = fluid.layers.pool2d( input=data, pool_size=3, pool_stride=2, pool_type='max') - conv2 = self.conv_bn_layer(data, 192, 1, act='relu') - conv2 = 
self.conv_bn_layer(conv2, 192, 3, stride=2, act='relu') + conv2 = self.conv_bn_layer( + data, 192, 1, act='relu', name="reduction_b_3x3_reduce") + conv2 = self.conv_bn_layer( + conv2, 192, 3, stride=2, act='relu', name="reduction_b_3x3") - conv3 = self.conv_bn_layer(data, 256, 1, act='relu') conv3 = self.conv_bn_layer( - conv3, 256, (1, 7), padding=(0, 3), act='relu') + data, 256, 1, act='relu', name="reduction_b_1x7_reduce") + conv3 = self.conv_bn_layer( + conv3, + 256, (1, 7), + padding=(0, 3), + act='relu', + name="reduction_b_1x7") + conv3 = self.conv_bn_layer( + conv3, + 320, (7, 1), + padding=(3, 0), + act='relu', + name="reduction_b_7x1") conv3 = self.conv_bn_layer( - conv3, 320, (7, 1), padding=(3, 0), act='relu') - conv3 = self.conv_bn_layer(conv3, 320, 3, stride=2, act='relu') + conv3, 320, 3, stride=2, act='relu', name="reduction_b_3x3_2") concat = fluid.layers.concat([pool1, conv2, conv3], axis=1) return concat - def inceptionC(self, data): + def inceptionC(self, data, name=None): pool1 = fluid.layers.pool2d( input=data, pool_size=3, pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer(pool1, 256, 1, act='relu') + conv1 = self.conv_bn_layer( + pool1, 256, 1, act='relu', name="inception_c" + name + "_1x1") - conv2 = self.conv_bn_layer(data, 256, 1, act='relu') + conv2 = self.conv_bn_layer( + data, 256, 1, act='relu', name="inception_c" + name + "_1x1_2") - conv3 = self.conv_bn_layer(data, 384, 1, act='relu') + conv3 = self.conv_bn_layer( + data, 384, 1, act='relu', name="inception_c" + name + "_1x1_3") conv3_1 = self.conv_bn_layer( - conv3, 256, (1, 3), padding=(0, 1), act='relu') + conv3, + 256, (1, 3), + padding=(0, 1), + act='relu', + name="inception_c" + name + "_1x3") conv3_2 = self.conv_bn_layer( - conv3, 256, (3, 1), padding=(1, 0), act='relu') + conv3, + 256, (3, 1), + padding=(1, 0), + act='relu', + name="inception_c" + name + "_3x1") - conv4 = self.conv_bn_layer(data, 384, 1, act='relu') conv4 = self.conv_bn_layer( - conv4, 448, (1, 
3), padding=(0, 1), act='relu') + data, 384, 1, act='relu', name="inception_c" + name + "_1x1_4") + conv4 = self.conv_bn_layer( + conv4, + 448, (1, 3), + padding=(0, 1), + act='relu', + name="inception_c" + name + "_1x3_2") conv4 = self.conv_bn_layer( - conv4, 512, (3, 1), padding=(1, 0), act='relu') + conv4, + 512, (3, 1), + padding=(1, 0), + act='relu', + name="inception_c" + name + "_3x1_2") conv4_1 = self.conv_bn_layer( - conv4, 256, (1, 3), padding=(0, 1), act='relu') + conv4, + 256, (1, 3), + padding=(0, 1), + act='relu', + name="inception_c" + name + "_1x3_3") conv4_2 = self.conv_bn_layer( - conv4, 256, (3, 1), padding=(1, 0), act='relu') + conv4, + 256, (3, 1), + padding=(1, 0), + act='relu', + name="inception_c" + name + "_3x1_3") concat = fluid.layers.concat( [conv1, conv2, conv3_1, conv3_2, conv4_1, conv4_2], axis=1) diff --git a/fluid/PaddleCV/image_classification/models/mobilenet.py b/fluid/PaddleCV/image_classification/models/mobilenet.py index d0b419e8b4083104ba529c9f886284aa724953e6..d242bc946a7b4bec9c9d2e34da2496c0901ba870 100644 --- a/fluid/PaddleCV/image_classification/models/mobilenet.py +++ b/fluid/PaddleCV/image_classification/models/mobilenet.py @@ -32,7 +32,8 @@ class MobileNet(): channels=3, num_filters=int(32 * scale), stride=2, - padding=1) + padding=1, + name="conv1") # 56x56 input = self.depthwise_separable( @@ -41,7 +42,8 @@ class MobileNet(): num_filters2=64, num_groups=32, stride=1, - scale=scale) + scale=scale, + name="conv2_1") input = self.depthwise_separable( input, @@ -49,7 +51,8 @@ class MobileNet(): num_filters2=128, num_groups=64, stride=2, - scale=scale) + scale=scale, + name="conv2_2") # 28x28 input = self.depthwise_separable( @@ -58,7 +61,8 @@ class MobileNet(): num_filters2=128, num_groups=128, stride=1, - scale=scale) + scale=scale, + name="conv3_1") input = self.depthwise_separable( input, @@ -66,7 +70,8 @@ class MobileNet(): num_filters2=256, num_groups=128, stride=2, - scale=scale) + scale=scale, + name="conv3_2") # 
14x14 input = self.depthwise_separable( @@ -75,7 +80,8 @@ class MobileNet(): num_filters2=256, num_groups=256, stride=1, - scale=scale) + scale=scale, + name="conv4_1") input = self.depthwise_separable( input, @@ -83,7 +89,8 @@ class MobileNet(): num_filters2=512, num_groups=256, stride=2, - scale=scale) + scale=scale, + name="conv4_2") # 14x14 for i in range(5): @@ -93,7 +100,8 @@ class MobileNet(): num_filters2=512, num_groups=512, stride=1, - scale=scale) + scale=scale, + name="conv5" + "_" + str(i + 1)) # 7x7 input = self.depthwise_separable( input, @@ -101,7 +109,8 @@ class MobileNet(): num_filters2=1024, num_groups=512, stride=2, - scale=scale) + scale=scale, + name="conv5_6") input = self.depthwise_separable( input, @@ -109,7 +118,8 @@ class MobileNet(): num_filters2=1024, num_groups=1024, stride=1, - scale=scale) + scale=scale, + name="conv6") input = fluid.layers.pool2d( input=input, @@ -120,7 +130,9 @@ class MobileNet(): output = fluid.layers.fc(input=input, size=class_dim, - param_attr=ParamAttr(initializer=MSRA())) + param_attr=ParamAttr( + initializer=MSRA(), name="fc7_weights"), + bias_attr=ParamAttr(name="fc7_offset")) return output def conv_bn_layer(self, @@ -132,7 +144,8 @@ class MobileNet(): channels=None, num_groups=1, act='relu', - use_cudnn=True): + use_cudnn=True, + name=None): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -142,12 +155,26 @@ class MobileNet(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr(initializer=MSRA()), + param_attr=ParamAttr( + initializer=MSRA(), name=name + "_weights"), bias_attr=False) - return fluid.layers.batch_norm(input=conv, act=act) - - def depthwise_separable(self, input, num_filters1, num_filters2, num_groups, - stride, scale): + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(name=bn_name + "_offset"), + moving_mean_name=bn_name + '_mean', + 
moving_variance_name=bn_name + '_variance') + + def depthwise_separable(self, + input, + num_filters1, + num_filters2, + num_groups, + stride, + scale, + name=None): depthwise_conv = self.conv_bn_layer( input=input, filter_size=3, @@ -155,12 +182,14 @@ class MobileNet(): stride=stride, padding=1, num_groups=int(num_groups * scale), - use_cudnn=False) + use_cudnn=False, + name=name + "_dw") pointwise_conv = self.conv_bn_layer( input=depthwise_conv, filter_size=1, num_filters=int(num_filters2 * scale), stride=1, - padding=0) + padding=0, + name=name + "_sep") return pointwise_conv diff --git a/fluid/PaddleCV/image_classification/models/mobilenet_v2.py b/fluid/PaddleCV/image_classification/models/mobilenet_v2.py index c219b1bf5a7260fbb07627bc3fa039f4b2833092..77d88c7da625c0c953c75d229148868f0481f2a2 100644 --- a/fluid/PaddleCV/image_classification/models/mobilenet_v2.py +++ b/fluid/PaddleCV/image_classification/models/mobilenet_v2.py @@ -36,33 +36,40 @@ class MobileNetV2(): (6, 320, 1, 1), ] + #conv1 input = self.conv_bn_layer( input, num_filters=int(32 * scale), filter_size=3, stride=2, padding=1, - if_act=True) + if_act=True, + name='conv1_1') + # bottleneck sequences + i = 1 in_c = int(32 * scale) for layer_setting in bottleneck_params_list: t, c, n, s = layer_setting + i += 1 input = self.invresi_blocks( input=input, in_c=in_c, t=t, c=int(c * scale), n=n, - s=s, ) + s=s, + name='conv' + str(i)) in_c = int(c * scale) - + #last_conv input = self.conv_bn_layer( input=input, num_filters=int(1280 * scale) if scale > 1.0 else 1280, filter_size=1, stride=1, padding=0, - if_act=True) + if_act=True, + name='conv9') input = fluid.layers.pool2d( input=input, @@ -73,7 +80,8 @@ class MobileNetV2(): output = fluid.layers.fc(input=input, size=class_dim, - param_attr=ParamAttr(initializer=MSRA())) + param_attr=ParamAttr(name='fc10_weights'), + bias_attr=ParamAttr(name='fc10_offset')) return output def conv_bn_layer(self, @@ -84,8 +92,9 @@ class MobileNetV2(): padding, 
channels=None, num_groups=1, - use_cudnn=True, - if_act=True): + if_act=True, + name=None, + use_cudnn=True): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -95,9 +104,15 @@ class MobileNetV2(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr(initializer=MSRA()), + param_attr=ParamAttr(name=name + '_weights'), bias_attr=False) - bn = fluid.layers.batch_norm(input=conv) + bn_name = name + '_bn' + bn = fluid.layers.batch_norm( + input=conv, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(name=bn_name + "_offset"), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance') if if_act: return fluid.layers.relu6(bn) else: @@ -106,10 +121,18 @@ class MobileNetV2(): def shortcut(self, input, data_residual): return fluid.layers.elementwise_add(input, data_residual) - def inverted_residual_unit(self, input, num_in_filter, num_filters, - ifshortcut, stride, filter_size, padding, - expansion_factor): + def inverted_residual_unit(self, + input, + num_in_filter, + num_filters, + ifshortcut, + stride, + filter_size, + padding, + expansion_factor, + name=None): num_expfilter = int(round(num_in_filter * expansion_factor)) + channel_expand = self.conv_bn_layer( input=input, num_filters=num_expfilter, @@ -117,7 +140,9 @@ class MobileNetV2(): stride=1, padding=0, num_groups=1, - if_act=True) + if_act=True, + name=name + '_expand') + bottleneck_conv = self.conv_bn_layer( input=channel_expand, num_filters=num_expfilter, @@ -126,7 +151,9 @@ class MobileNetV2(): padding=padding, num_groups=num_expfilter, if_act=True, + name=name + '_dwise', use_cudnn=False) + linear_out = self.conv_bn_layer( input=bottleneck_conv, num_filters=num_filters, @@ -134,14 +161,15 @@ class MobileNetV2(): stride=1, padding=0, num_groups=1, - if_act=False) + if_act=False, + name=name + '_linear') if ifshortcut: out = self.shortcut(input=input, data_residual=linear_out) return out else: return linear_out - def 
invresi_blocks(self, input, in_c, t, c, n, s): + def invresi_blocks(self, input, in_c, t, c, n, s, name=None): first_block = self.inverted_residual_unit( input=input, num_in_filter=in_c, @@ -150,7 +178,8 @@ class MobileNetV2(): stride=s, filter_size=3, padding=1, - expansion_factor=t) + expansion_factor=t, + name=name + '_1') last_residual_block = first_block last_c = c @@ -164,5 +193,6 @@ class MobileNetV2(): stride=1, filter_size=3, padding=1, - expansion_factor=t) + expansion_factor=t, + name=name + '_' + str(i + 1)) return last_residual_block diff --git a/fluid/PaddleCV/image_classification/models/resnet.py b/fluid/PaddleCV/image_classification/models/resnet.py index def99db6d84673b77582cf93374f4cb2f00e9ac5..d99181e82d909008fc5fc2aafea55439463e4820 100644 --- a/fluid/PaddleCV/image_classification/models/resnet.py +++ b/fluid/PaddleCV/image_classification/models/resnet.py @@ -4,8 +4,9 @@ from __future__ import print_function import paddle import paddle.fluid as fluid import math +from paddle.fluid.param_attr import ParamAttr -__all__ = ["ResNet", "ResNet50", "ResNet101", "ResNet152"] +__all__ = ["ResNet", "ResNet18", "ResNet34", "ResNet50", "ResNet101", "ResNet152"] train_parameters = { "input_size": [3, 224, 224], @@ -27,11 +28,13 @@ class ResNet(): def net(self, input, class_dim=1000): layers = self.layers - supported_layers = [50, 101, 152] + supported_layers = [18, 34, 50, 101, 152] assert layers in supported_layers, \ "supported layers are {} but input layer is {}".format(supported_layers, layers) - if layers == 50: + if layers == 18: + depth = [2, 2, 2, 2] + elif layers == 34 or layers == 50: depth = [3, 4, 6, 3] elif layers == 101: depth = [3, 4, 23, 3] @@ -40,29 +43,53 @@ class ResNet(): num_filters = [64, 128, 256, 512] conv = self.conv_bn_layer( - input=input, num_filters=64, filter_size=7, stride=2, act='relu') + input=input, num_filters=64, filter_size=7, stride=2, act='relu',name="conv1") conv = fluid.layers.pool2d( input=conv, pool_size=3, 
pool_stride=2, pool_padding=1, pool_type='max') - - for block in range(len(depth)): - for i in range(depth[block]): - conv = self.bottleneck_block( - input=conv, - num_filters=num_filters[block], - stride=2 if i == 0 and block != 0 else 1) - - pool = fluid.layers.pool2d( - input=conv, pool_size=7, pool_type='avg', global_pooling=True) - stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) - out = fluid.layers.fc(input=pool, - size=class_dim, - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, - stdv))) + if layers >= 50: + for block in range(len(depth)): + for i in range(depth[block]): + if layers in [101, 152] and block == 2: + if i == 0: + conv_name="res"+str(block+2)+"a" + else: + conv_name="res"+str(block+2)+"b"+str(i) + else: + conv_name="res"+str(block+2)+chr(97+i) + conv = self.bottleneck_block( + input=conv, + num_filters=num_filters[block], + stride=2 if i == 0 and block != 0 else 1, name=conv_name) + + pool = fluid.layers.pool2d( + input=conv, pool_size=7, pool_type='avg', global_pooling=True) + stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) + out = fluid.layers.fc(input=pool, + size=class_dim, + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv))) + else: + for block in range(len(depth)): + for i in range(depth[block]): + conv_name="res"+str(block+2)+chr(97+i) + conv = self.basic_block( + input=conv, + num_filters=num_filters[block], + stride=2 if i == 0 and block != 0 else 1, + is_first=block==i==0, + name=conv_name) + + pool = fluid.layers.pool2d( + input=conv, pool_size=7, pool_type='avg', global_pooling=True) + stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) + out = fluid.layers.fc(input=pool, + size=class_dim, + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv))) return out def conv_bn_layer(self, @@ -71,40 +98,73 @@ class ResNet(): filter_size, stride=1, groups=1, - act=None): + act=None, + name=None): conv = fluid.layers.conv2d( input=input, 
num_filters=num_filters, filter_size=filter_size, stride=stride, padding=(filter_size - 1) // 2, groups=groups, act=None, - bias_attr=False) - return fluid.layers.batch_norm(input=conv, act=act) - - def shortcut(self, input, ch_out, stride): + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name + '.conv2d.output.1') + + if name == "conv1": + bn_name = "bn_" + name + else: + bn_name = "bn" + name[3:] + return fluid.layers.batch_norm(input=conv, + act=act, + name=bn_name+'.output.1', + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance',) + + def shortcut(self, input, ch_out, stride, is_first, name): ch_in = input.shape[1] - if ch_in != ch_out or stride != 1: - return self.conv_bn_layer(input, ch_out, 1, stride) + if ch_in != ch_out or stride != 1 or is_first == True: + return self.conv_bn_layer(input, ch_out, 1, stride, name=name) else: return input - def bottleneck_block(self, input, num_filters, stride): + def bottleneck_block(self, input, num_filters, stride, name): conv0 = self.conv_bn_layer( - input=input, num_filters=num_filters, filter_size=1, act='relu') + input=input, num_filters=num_filters, filter_size=1, act='relu',name=name+"_branch2a") conv1 = self.conv_bn_layer( input=conv0, num_filters=num_filters, filter_size=3, stride=stride, - act='relu') + act='relu', + name=name+"_branch2b") conv2 = self.conv_bn_layer( - input=conv1, num_filters=num_filters * 4, filter_size=1, act=None) + input=conv1, num_filters=num_filters * 4, filter_size=1, act=None, name=name+"_branch2c") + + short = self.shortcut(input, num_filters * 4, stride, is_first=False, name=name + "_branch1") + + return fluid.layers.elementwise_add(x=short, y=conv2, act='relu',name=name+".add.output.5") + + def basic_block(self, input, num_filters, stride, is_first, name): + conv0 = self.conv_bn_layer(input=input,
num_filters=num_filters, filter_size=3, act='relu', stride=stride, + name=name+"_branch2a") + conv1 = self.conv_bn_layer(input=conv0, num_filters=num_filters, filter_size=3, act=None, + name=name+"_branch2b") + short = self.shortcut(input, num_filters, stride, is_first, name=name + "_branch1") + return fluid.layers.elementwise_add(x=short, y=conv1, act='relu') + + +def ResNet18(): + model = ResNet(layers=18) + return model - short = self.shortcut(input, num_filters * 4, stride) - return fluid.layers.elementwise_add(x=short, y=conv2, act='relu') +def ResNet34(): + model = ResNet(layers=34) + return model def ResNet50(): diff --git a/fluid/PaddleCV/image_classification/models/se_resnext.py b/fluid/PaddleCV/image_classification/models/se_resnext.py index ac50bd87b5070000a018949e777a897427c3e5a5..0ae3d66fddbe2d1b9da5e2f52fe80d15931d256d 100644 --- a/fluid/PaddleCV/image_classification/models/se_resnext.py +++ b/fluid/PaddleCV/image_classification/models/se_resnext.py @@ -4,6 +4,7 @@ from __future__ import print_function import paddle import paddle.fluid as fluid import math +from paddle.fluid.param_attr import ParamAttr __all__ = [ "SE_ResNeXt", "SE_ResNeXt50_32x4d", "SE_ResNeXt101_32x4d", @@ -18,7 +19,7 @@ train_parameters = { "learning_strategy": { "name": "piecewise_decay", "batch_size": 256, - "epochs": [30, 60, 90], + "epochs": [40, 80, 100], "steps": [0.1, 0.01, 0.001, 0.0001] } } @@ -45,7 +46,8 @@ class SE_ResNeXt(): num_filters=64, filter_size=7, stride=2, - act='relu') + act='relu', + name='conv1', ) conv = fluid.layers.pool2d( input=conv, pool_size=3, @@ -63,7 +65,8 @@ class SE_ResNeXt(): num_filters=64, filter_size=7, stride=2, - act='relu') + act='relu', + name="conv1", ) conv = fluid.layers.pool2d( input=conv, pool_size=3, @@ -81,67 +84,94 @@ class SE_ResNeXt(): num_filters=64, filter_size=3, stride=2, - act='relu') + act='relu', + name='conv1') conv = self.conv_bn_layer( - input=conv, num_filters=64, filter_size=3, stride=1, act='relu') + input=conv, + 
num_filters=64, + filter_size=3, + stride=1, + act='relu', + name='conv2') conv = self.conv_bn_layer( input=conv, num_filters=128, filter_size=3, stride=1, - act='relu') + act='relu', + name='conv3') conv = fluid.layers.pool2d( input=conv, pool_size=3, pool_stride=2, pool_padding=1, \ pool_type='max') - + n = 1 if layers == 50 or layers == 101 else 3 for block in range(len(depth)): + n += 1 for i in range(depth[block]): conv = self.bottleneck_block( input=conv, num_filters=num_filters[block], stride=2 if i == 0 and block != 0 else 1, cardinality=cardinality, - reduction_ratio=reduction_ratio) + reduction_ratio=reduction_ratio, + name=str(n) + '_' + str(i + 1)) pool = fluid.layers.pool2d( input=conv, pool_size=7, pool_type='avg', global_pooling=True) drop = fluid.layers.dropout( x=pool, dropout_prob=0.5, seed=self.params['dropout_seed']) stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0) - out = fluid.layers.fc(input=drop, - size=class_dim, - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, - stdv))) + out = fluid.layers.fc( + input=drop, + size=class_dim, + param_attr=ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name='fc6_weights'), + bias_attr=ParamAttr(name='fc6_offset')) return out - def shortcut(self, input, ch_out, stride): + def shortcut(self, input, ch_out, stride, name): ch_in = input.shape[1] if ch_in != ch_out or stride != 1: filter_size = 1 - return self.conv_bn_layer(input, ch_out, filter_size, stride) + return self.conv_bn_layer( + input, ch_out, filter_size, stride, name='conv' + name + '_prj') else: return input - def bottleneck_block(self, input, num_filters, stride, cardinality, - reduction_ratio): + def bottleneck_block(self, + input, + num_filters, + stride, + cardinality, + reduction_ratio, + name=None): conv0 = self.conv_bn_layer( - input=input, num_filters=num_filters, filter_size=1, act='relu') + input=input, + num_filters=num_filters, + filter_size=1, + act='relu', + name='conv' + name + 
'_x1') conv1 = self.conv_bn_layer( input=conv0, num_filters=num_filters, filter_size=3, stride=stride, groups=cardinality, - act='relu') + act='relu', + name='conv' + name + '_x2') conv2 = self.conv_bn_layer( - input=conv1, num_filters=num_filters * 2, filter_size=1, act=None) + input=conv1, + num_filters=num_filters * 2, + filter_size=1, + act=None, + name='conv' + name + '_x3') scale = self.squeeze_excitation( input=conv2, num_channels=num_filters * 2, - reduction_ratio=reduction_ratio) + reduction_ratio=reduction_ratio, + name='fc' + name) - short = self.shortcut(input, num_filters * 2, stride) + short = self.shortcut(input, num_filters * 2, stride, name=name) return fluid.layers.elementwise_add(x=short, y=scale, act='relu') @@ -151,7 +181,8 @@ class SE_ResNeXt(): filter_size, stride=1, groups=1, - act=None): + act=None, + name=None): conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -160,26 +191,42 @@ class SE_ResNeXt(): padding=(filter_size - 1) // 2, groups=groups, act=None, - bias_attr=False) - return fluid.layers.batch_norm(input=conv, act=act) - - def squeeze_excitation(self, input, num_channels, reduction_ratio): + bias_attr=False, + param_attr=ParamAttr(name=name + '_weights'), ) + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance') + + def squeeze_excitation(self, + input, + num_channels, + reduction_ratio, + name=None): pool = fluid.layers.pool2d( input=input, pool_size=0, pool_type='avg', global_pooling=True) stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) - squeeze = fluid.layers.fc(input=pool, - size=num_channels // reduction_ratio, - act='relu', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform( - -stdv, stdv))) + squeeze = fluid.layers.fc( + input=pool, + size=num_channels // reduction_ratio, + 
act='relu', + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=name + '_sqz_weights'), + bias_attr=ParamAttr(name=name + '_sqz_offset')) stdv = 1.0 / math.sqrt(squeeze.shape[1] * 1.0) - excitation = fluid.layers.fc(input=squeeze, - size=num_channels, - act='sigmoid', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Uniform( - -stdv, stdv))) + excitation = fluid.layers.fc( + input=squeeze, + size=num_channels, + act='sigmoid', + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.Uniform(-stdv, stdv), + name=name + '_exc_weights'), + bias_attr=ParamAttr(name=name + '_exc_offset')) scale = fluid.layers.elementwise_mul(x=input, y=excitation, axis=0) return scale diff --git a/fluid/PaddleCV/image_classification/models/shufflenet_v2.py b/fluid/PaddleCV/image_classification/models/shufflenet_v2.py index 6db88aa769dd6b3b3e2987fcac6d8054319a2a56..c0f3d0d6e08454d0d216e758ff5328ee4dee3151 100644 --- a/fluid/PaddleCV/image_classification/models/shufflenet_v2.py +++ b/fluid/PaddleCV/image_classification/models/shufflenet_v2.py @@ -1,14 +1,9 @@ -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function import paddle.fluid as fluid from paddle.fluid.initializer import MSRA from paddle.fluid.param_attr import ParamAttr -__all__ = [ - 'ShuffleNetV2', 'ShuffleNetV2_x0_5', 'ShuffleNetV2_x1_0', - 'ShuffleNetV2_x1_5', 'ShuffleNetV2_x2_0' -] +__all__ = ['ShuffleNetV2', 'ShuffleNetV2_x0_5_swish', 'ShuffleNetV2_x1_0_swish', 'ShuffleNetV2_x1_5_swish', + 'ShuffleNetV2_x2_0_swish', 'ShuffleNetV2_x8_0_swish'] train_parameters = { "input_size": [3, 224, 224], @@ -29,82 +24,65 @@ class ShuffleNetV2(): self.scale = scale def net(self, input, class_dim=1000): - scale = self.scale + scale = self.scale stage_repeats = [4, 8, 4] - + if scale == 0.5: - stage_out_channels = [-1, 24, 48, 96, 192, 1024] + stage_out_channels = [-1, 24, 48, 96, 192, 1024] elif 
scale == 1.0: stage_out_channels = [-1, 24, 116, 232, 464, 1024] elif scale == 1.5: stage_out_channels = [-1, 24, 176, 352, 704, 1024] elif scale == 2.0: stage_out_channels = [-1, 24, 224, 488, 976, 2048] + elif scale == 8.0: + stage_out_channels = [-1, 48, 896, 1952, 3904, 8192] else: - raise ValueError("""{} groups is not supported for + raise ValueError( + """{} groups is not supported for 1x1 Grouped Convolutions""".format(num_groups)) #conv1 - + input_channel = stage_out_channels[1] - conv1 = self.conv_bn_layer( - input=input, - filter_size=3, - num_filters=input_channel, - padding=1, - stride=2) - pool1 = fluid.layers.pool2d( - input=conv1, - pool_size=3, - pool_stride=2, - pool_padding=1, - pool_type='max') + conv1 = self.conv_bn_layer(input=input, filter_size=3, num_filters=input_channel, padding=1, stride=2,name='stage1_conv') + pool1 = fluid.layers.pool2d(input=conv1, pool_size=3, pool_stride=2, pool_padding=1, pool_type='max') conv = pool1 # bottleneck sequences for idxstage in range(len(stage_repeats)): numrepeat = stage_repeats[idxstage] - output_channel = stage_out_channels[idxstage + 2] + output_channel = stage_out_channels[idxstage+2] for i in range(numrepeat): if i == 0: - conv = self.inverted_residual_unit( - input=conv, - num_filters=output_channel, - stride=2, - benchmodel=2) + conv = self.inverted_residual_unit(input=conv, num_filters=output_channel, stride=2, + benchmodel=2,name=str(idxstage+2)+'_'+str(i+1)) else: - conv = self.inverted_residual_unit( - input=conv, - num_filters=output_channel, - stride=1, - benchmodel=1) - - conv_last = self.conv_bn_layer( - input=conv, - filter_size=1, - num_filters=stage_out_channels[-1], - padding=0, - stride=1) - pool_last = fluid.layers.pool2d( - input=conv_last, - pool_size=7, - pool_stride=7, - pool_padding=0, - pool_type='avg') + conv = self.inverted_residual_unit(input=conv, num_filters=output_channel, stride=1, + benchmodel=1,name=str(idxstage+2)+'_'+str(i+1)) + + conv_last = 
self.conv_bn_layer(input=conv, filter_size=1, num_filters=stage_out_channels[-1], + padding=0, stride=1, name='conv5') + pool_last = fluid.layers.pool2d(input=conv_last, pool_size=7, pool_stride=1, pool_padding=0, pool_type='avg') + output = fluid.layers.fc(input=pool_last, size=class_dim, - param_attr=ParamAttr(initializer=MSRA())) + param_attr=ParamAttr(initializer=MSRA(),name='fc6_weights'), + bias_attr=ParamAttr(name='fc6_offset')) return output + def conv_bn_layer(self, - input, - filter_size, - num_filters, - stride, - padding, - num_groups=1, - use_cudnn=True, - if_act=True): + input, + filter_size, + num_filters, + stride, + padding, + num_groups=1, + use_cudnn=True, + if_act=True, + name=None): +# print(num_groups) conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -114,139 +92,164 @@ class ShuffleNetV2(): groups=num_groups, act=None, use_cudnn=use_cudnn, - param_attr=ParamAttr(initializer=MSRA()), + param_attr=ParamAttr(initializer=MSRA(),name=name+'_weights'), bias_attr=False) + out = int((input.shape[2] - 1)/float(stride) + 1) + # print(input.shape[1],(out, out), num_filters, (filter_size, filter_size), stride, + # (filter_size - 1) / 2, num_groups, name) + bn_name = name + '_bn' if if_act: - return fluid.layers.batch_norm(input=conv, act='relu') + return fluid.layers.batch_norm(input=conv, act='swish', + param_attr = ParamAttr(name=bn_name+"_scale"), + bias_attr=ParamAttr(name=bn_name+"_offset"), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance') else: - return fluid.layers.batch_norm(input=conv) + return fluid.layers.batch_norm(input=conv, + param_attr = ParamAttr(name=bn_name+"_scale"), + bias_attr=ParamAttr(name=bn_name+"_offset"), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance') + def channel_shuffle(self, x, groups): - batchsize, num_channels, height, width = x.shape[0], x.shape[ - 1], x.shape[2], x.shape[3] + batchsize, num_channels, height, width = x.shape[0], 
x.shape[1], x.shape[2], x.shape[3] channels_per_group = num_channels // groups - + # reshape - x = fluid.layers.reshape( - x=x, shape=[batchsize, groups, channels_per_group, height, width]) + x = fluid.layers.reshape(x=x, shape=[batchsize, groups, channels_per_group, height, width]) - x = fluid.layers.transpose(x=x, perm=[0, 2, 1, 3, 4]) + x = fluid.layers.transpose(x=x, perm=[0,2,1,3,4]) # flatten - x = fluid.layers.reshape( - x=x, shape=[batchsize, num_channels, height, width]) + x = fluid.layers.reshape(x=x, shape=[batchsize, num_channels, height, width]) return x - def inverted_residual_unit(self, input, num_filters, stride, benchmodel): + + def inverted_residual_unit(self, input, num_filters, stride, benchmodel, name=None): assert stride in [1, 2], \ "supported stride are {} but your stride is {}".format([1,2], stride) - - oup_inc = num_filters // 2 + + oup_inc = num_filters//2 inp = input.shape[1] - + if benchmodel == 1: x1, x2 = fluid.layers.split( - input, - num_or_sections=[input.shape[1] // 2, input.shape[1] // 2], - dim=1) - + input, num_or_sections=[input.shape[1]//2, input.shape[1]//2], dim=1) +# x1 = input[:, :(input.shape[1]//2), :, :] +# x2 = input[:, (input.shape[1]//2):, :, :] + conv_pw = self.conv_bn_layer( - input=x2, - num_filters=oup_inc, - filter_size=1, + input=x2, + num_filters=oup_inc, + filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True) + if_act=True, + name='stage_'+name+'_conv1') conv_dw = self.conv_bn_layer( - input=conv_pw, - num_filters=oup_inc, - filter_size=3, - stride=stride, + input=conv_pw, + num_filters=oup_inc, + filter_size=3, + stride=stride, padding=1, - num_groups=oup_inc, - if_act=False) + num_groups=oup_inc, + if_act=False, + use_cudnn=False, + name='stage_'+name+'_conv2') conv_linear = self.conv_bn_layer( - input=conv_dw, - num_filters=oup_inc, - filter_size=1, - stride=1, + input=conv_dw, + num_filters=oup_inc, + filter_size=1, + stride=1, padding=0, - num_groups=1, - if_act=True) - + num_groups=1, + 
if_act=True, + name='stage_'+name+'_conv3') + out = fluid.layers.concat([x1, conv_linear], axis=1) + else: #branch1 - conv_dw = self.conv_bn_layer( - input=input, - num_filters=inp, - filter_size=3, + conv_dw_1 = self.conv_bn_layer( + input=input, + num_filters=inp, + filter_size=3, stride=stride, padding=1, num_groups=inp, - if_act=False) - + if_act=False, + use_cudnn=False, + name='stage_'+name+'_conv4') + conv_linear_1 = self.conv_bn_layer( - input=conv_dw, - num_filters=oup_inc, - filter_size=1, + input=conv_dw_1, + num_filters=oup_inc, + filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True) - + if_act=True, + name='stage_'+name+'_conv5') + #branch2 - conv_pw = self.conv_bn_layer( - input=input, - num_filters=oup_inc, - filter_size=1, + conv_pw_2 = self.conv_bn_layer( + input=input, + num_filters=oup_inc, + filter_size=1, stride=1, padding=0, num_groups=1, - if_act=True) - - conv_dw = self.conv_bn_layer( - input=conv_pw, - num_filters=oup_inc, - filter_size=3, - stride=stride, + if_act=True, + name='stage_'+name+'_conv1') + + conv_dw_2 = self.conv_bn_layer( + input=conv_pw_2, + num_filters=oup_inc, + filter_size=3, + stride=stride, padding=1, - num_groups=oup_inc, - if_act=False) + num_groups=oup_inc, + if_act=False, + use_cudnn=False, + name='stage_'+name+'_conv2') conv_linear_2 = self.conv_bn_layer( - input=conv_dw, - num_filters=oup_inc, - filter_size=1, - stride=1, + input=conv_dw_2, + num_filters=oup_inc, + filter_size=1, + stride=1, padding=0, - num_groups=1, - if_act=True) + num_groups=1, + if_act=True, + name='stage_'+name+'_conv3') out = fluid.layers.concat([conv_linear_1, conv_linear_2], axis=1) - + return self.channel_shuffle(out, 2) - - -def ShuffleNetV2_x0_5(): + +def ShuffleNetV2_x0_5_swish(): model = ShuffleNetV2(scale=0.5) return model - -def ShuffleNetV2_x1_0(): +def ShuffleNetV2_x1_0_swish(): model = ShuffleNetV2(scale=1.0) return model - -def ShuffleNetV2_x1_5(): +def ShuffleNetV2_x1_5_swish(): model = ShuffleNetV2(scale=1.5) 
return model - -def ShuffleNetV2_x2_0(): +def ShuffleNetV2_x2_0_swish(): model = ShuffleNetV2(scale=2.0) return model + +def ShuffleNetV2_x8_0_swish(): + model = ShuffleNetV2(scale=8.0) + return model + + diff --git a/fluid/PaddleCV/image_classification/models/vgg.py b/fluid/PaddleCV/image_classification/models/vgg.py index 7f559982334575c7c2bc778e1be8a4ebf69549fc..8fcd2d9f1c397a428685cfb7bd264f18c0d0a7e7 100644 --- a/fluid/PaddleCV/image_classification/models/vgg.py +++ b/fluid/PaddleCV/image_classification/models/vgg.py @@ -36,42 +36,37 @@ class VGGNet(): "supported layers are {} but input layer is {}".format(vgg_spec.keys(), layers) nums = vgg_spec[layers] - conv1 = self.conv_block(input, 64, nums[0]) - conv2 = self.conv_block(conv1, 128, nums[1]) - conv3 = self.conv_block(conv2, 256, nums[2]) - conv4 = self.conv_block(conv3, 512, nums[3]) - conv5 = self.conv_block(conv4, 512, nums[4]) + conv1 = self.conv_block(input, 64, nums[0], name="conv1_") + conv2 = self.conv_block(conv1, 128, nums[1], name="conv2_") + conv3 = self.conv_block(conv2, 256, nums[2], name="conv3_") + conv4 = self.conv_block(conv3, 512, nums[3], name="conv4_") + conv5 = self.conv_block(conv4, 512, nums[4], name="conv5_") fc_dim = 4096 + fc_name = ["fc6", "fc7", "fc8"] fc1 = fluid.layers.fc( input=conv5, size=fc_dim, act='relu', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Normal(scale=0.005)), - bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Constant(value=0.1))) + param_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_weights"), + bias_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_offset")) fc1 = fluid.layers.dropout(x=fc1, dropout_prob=0.5) fc2 = fluid.layers.fc( input=fc1, size=fc_dim, act='relu', - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Normal(scale=0.005)), - bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Constant(value=0.1))) + 
param_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_weights"), + bias_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_offset")) fc2 = fluid.layers.dropout(x=fc2, dropout_prob=0.5) out = fluid.layers.fc( input=fc2, size=class_dim, - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Normal(scale=0.005)), - bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Constant(value=0.1))) + param_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_weights"), + bias_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_offset")) return out - def conv_block(self, input, num_filter, groups): + def conv_block(self, input, num_filter, groups, name=None): conv = input for i in range(groups): conv = fluid.layers.conv2d( @@ -82,9 +77,9 @@ class VGGNet(): padding=1, act='relu', param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Normal(scale=0.01)), + name=name + str(i + 1) + "_weights"), bias_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.Constant(value=0.0))) + name=name + str(i + 1) + "_offset")) return fluid.layers.pool2d( input=conv, pool_size=2, pool_type='max', pool_stride=2) diff --git a/fluid/PaddleCV/image_classification/models_name/googlenet.py b/fluid/PaddleCV/image_classification/models_name/googlenet.py deleted file mode 100644 index bd9040c53e61a48d9f5bff6683bec961d3f95583..0000000000000000000000000000000000000000 --- a/fluid/PaddleCV/image_classification/models_name/googlenet.py +++ /dev/null @@ -1,233 +0,0 @@ -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -import paddle -import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr - -__all__ = ['GoogleNet'] - -train_parameters = { - "input_size": [3, 224, 224], - "input_mean": [0.485, 0.456, 0.406], - "input_std": [0.229, 0.224, 0.225], - "learning_strategy": { - "name": "piecewise_decay", - "batch_size": 256, - "epochs": [30, 70, 100], - "steps": [0.1, 
0.01, 0.001, 0.0001] - } -} - - -class GoogleNet(): - def __init__(self): - self.params = train_parameters - - def conv_layer(self, - input, - num_filters, - filter_size, - stride=1, - groups=1, - act=None, - name=None): - channels = input.shape[1] - stdv = (3.0 / (filter_size**2 * channels))**0.5 - param_attr = ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=name + "_weights") - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=act, - param_attr=param_attr, - bias_attr=False, - name=name) - return conv - - def xavier(self, channels, filter_size, name): - stdv = (3.0 / (filter_size**2 * channels))**0.5 - param_attr = ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name=name + "_weights") - - return param_attr - - def inception(self, - input, - channels, - filter1, - filter3R, - filter3, - filter5R, - filter5, - proj, - name=None): - conv1 = self.conv_layer( - input=input, - num_filters=filter1, - filter_size=1, - stride=1, - act=None, - name="inception_" + name + "_1x1") - conv3r = self.conv_layer( - input=input, - num_filters=filter3R, - filter_size=1, - stride=1, - act=None, - name="inception_" + name + "_3x3_reduce") - conv3 = self.conv_layer( - input=conv3r, - num_filters=filter3, - filter_size=3, - stride=1, - act=None, - name="inception_" + name + "_3x3") - conv5r = self.conv_layer( - input=input, - num_filters=filter5R, - filter_size=1, - stride=1, - act=None, - name="inception_" + name + "_5x5_reduce") - conv5 = self.conv_layer( - input=conv5r, - num_filters=filter5, - filter_size=5, - stride=1, - act=None, - name="inception_" + name + "_5x5") - pool = fluid.layers.pool2d( - input=input, - pool_size=3, - pool_stride=1, - pool_padding=1, - pool_type='max') - convprj = fluid.layers.conv2d( - input=pool, - filter_size=1, - num_filters=proj, - stride=1, - padding=0, - name="inception_" + name + 
"_3x3_proj", - param_attr=ParamAttr( - name="inception_" + name + "_3x3_proj_weights"), - bias_attr=False) - cat = fluid.layers.concat(input=[conv1, conv3, conv5, convprj], axis=1) - cat = fluid.layers.relu(cat) - return cat - - def net(self, input, class_dim=1000): - conv = self.conv_layer( - input=input, - num_filters=64, - filter_size=7, - stride=2, - act=None, - name="conv1") - pool = fluid.layers.pool2d( - input=conv, pool_size=3, pool_type='max', pool_stride=2) - - conv = self.conv_layer( - input=pool, - num_filters=64, - filter_size=1, - stride=1, - act=None, - name="conv2_1x1") - conv = self.conv_layer( - input=conv, - num_filters=192, - filter_size=3, - stride=1, - act=None, - name="conv2_3x3") - pool = fluid.layers.pool2d( - input=conv, pool_size=3, pool_type='max', pool_stride=2) - - ince3a = self.inception(pool, 192, 64, 96, 128, 16, 32, 32, "ince3a") - ince3b = self.inception(ince3a, 256, 128, 128, 192, 32, 96, 64, - "ince3b") - pool3 = fluid.layers.pool2d( - input=ince3b, pool_size=3, pool_type='max', pool_stride=2) - - ince4a = self.inception(pool3, 480, 192, 96, 208, 16, 48, 64, "ince4a") - ince4b = self.inception(ince4a, 512, 160, 112, 224, 24, 64, 64, - "ince4b") - ince4c = self.inception(ince4b, 512, 128, 128, 256, 24, 64, 64, - "ince4c") - ince4d = self.inception(ince4c, 512, 112, 144, 288, 32, 64, 64, - "ince4d") - ince4e = self.inception(ince4d, 528, 256, 160, 320, 32, 128, 128, - "ince4e") - pool4 = fluid.layers.pool2d( - input=ince4e, pool_size=3, pool_type='max', pool_stride=2) - - ince5a = self.inception(pool4, 832, 256, 160, 320, 32, 128, 128, - "ince5a") - ince5b = self.inception(ince5a, 832, 384, 192, 384, 48, 128, 128, - "ince5b") - pool5 = fluid.layers.pool2d( - input=ince5b, pool_size=7, pool_type='avg', pool_stride=7) - dropout = fluid.layers.dropout(x=pool5, dropout_prob=0.4) - out = fluid.layers.fc(input=dropout, - size=class_dim, - act='softmax', - param_attr=self.xavier(1024, 1, "out"), - name="out", - 
bias_attr=ParamAttr(name="out_offset")) - - pool_o1 = fluid.layers.pool2d( - input=ince4a, pool_size=5, pool_type='avg', pool_stride=3) - conv_o1 = self.conv_layer( - input=pool_o1, - num_filters=128, - filter_size=1, - stride=1, - act=None, - name="conv_o1") - fc_o1 = fluid.layers.fc(input=conv_o1, - size=1024, - act='relu', - param_attr=self.xavier(2048, 1, "fc_o1"), - name="fc_o1", - bias_attr=ParamAttr(name="fc_o1_offset")) - dropout_o1 = fluid.layers.dropout(x=fc_o1, dropout_prob=0.7) - out1 = fluid.layers.fc(input=dropout_o1, - size=class_dim, - act='softmax', - param_attr=self.xavier(1024, 1, "out1"), - name="out1", - bias_attr=ParamAttr(name="out1_offset")) - - pool_o2 = fluid.layers.pool2d( - input=ince4d, pool_size=5, pool_type='avg', pool_stride=3) - conv_o2 = self.conv_layer( - input=pool_o2, - num_filters=128, - filter_size=1, - stride=1, - act=None, - name="conv_o2") - fc_o2 = fluid.layers.fc(input=conv_o2, - size=1024, - act='relu', - param_attr=self.xavier(2048, 1, "fc_o2"), - name="fc_o2", - bias_attr=ParamAttr(name="fc_o2_offset")) - dropout_o2 = fluid.layers.dropout(x=fc_o2, dropout_prob=0.7) - out2 = fluid.layers.fc(input=dropout_o2, - size=class_dim, - act='softmax', - param_attr=self.xavier(1024, 1, "out2"), - name="out2", - bias_attr=ParamAttr(name="out2_offset")) - - # last fc layer is "out" - return out, out1, out2 diff --git a/fluid/PaddleCV/image_classification/models_name/inception_v4.py b/fluid/PaddleCV/image_classification/models_name/inception_v4.py deleted file mode 100644 index 8c6c0dbb129f903b4f0b849f930a520b5f17e5db..0000000000000000000000000000000000000000 --- a/fluid/PaddleCV/image_classification/models_name/inception_v4.py +++ /dev/null @@ -1,340 +0,0 @@ -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -import paddle -import paddle.fluid as fluid -import math -from paddle.fluid.param_attr import ParamAttr - -__all__ = ['InceptionV4'] - -train_parameters = { - 
"input_size": [3, 224, 224], - "input_mean": [0.485, 0.456, 0.406], - "input_std": [0.229, 0.224, 0.225], - "learning_strategy": { - "name": "piecewise_decay", - "batch_size": 256, - "epochs": [30, 60, 90], - "steps": [0.1, 0.01, 0.001, 0.0001] - } -} - - -class InceptionV4(): - def __init__(self): - self.params = train_parameters - - def net(self, input, class_dim=1000): - x = self.inception_stem(input) - - for i in range(4): - x = self.inceptionA(x, name=str(i + 1)) - x = self.reductionA(x) - - for i in range(7): - x = self.inceptionB(x, name=str(i + 1)) - x = self.reductionB(x) - - for i in range(3): - x = self.inceptionC(x, name=str(i + 1)) - - pool = fluid.layers.pool2d( - input=x, pool_size=8, pool_type='avg', global_pooling=True) - - drop = fluid.layers.dropout(x=pool, dropout_prob=0.2) - - stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0) - out = fluid.layers.fc( - input=drop, - size=class_dim, - param_attr=ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name="final_fc_weights"), - bias_attr=ParamAttr( - initializer=fluid.initializer.Uniform(-stdv, stdv), - name="final_fc_offset")) - return out - - def conv_bn_layer(self, - data, - num_filters, - filter_size, - stride=1, - padding=0, - groups=1, - act='relu', - name=None): - conv = fluid.layers.conv2d( - input=data, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=padding, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name) - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name, - param_attr=ParamAttr(name=bn_name + "_scale"), - bias_attr=ParamAttr(name=bn_name + "_offset"), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance') - - def inception_stem(self, data, name=None): - conv = self.conv_bn_layer( - data, 32, 3, stride=2, act='relu', name="conv1_3x3_s2") - conv = self.conv_bn_layer(conv, 32, 3, act='relu', name="conv2_3x3_s1") - 
conv = self.conv_bn_layer( - conv, 64, 3, padding=1, act='relu', name="conv3_3x3_s1") - - pool1 = fluid.layers.pool2d( - input=conv, pool_size=3, pool_stride=2, pool_type='max') - conv2 = self.conv_bn_layer( - conv, 96, 3, stride=2, act='relu', name="inception_stem1_3x3_s2") - concat = fluid.layers.concat([pool1, conv2], axis=1) - - conv1 = self.conv_bn_layer( - concat, 64, 1, act='relu', name="inception_stem2_3x3_reduce") - conv1 = self.conv_bn_layer( - conv1, 96, 3, act='relu', name="inception_stem2_3x3") - - conv2 = self.conv_bn_layer( - concat, 64, 1, act='relu', name="inception_stem2_1x7_reduce") - conv2 = self.conv_bn_layer( - conv2, - 64, (7, 1), - padding=(3, 0), - act='relu', - name="inception_stem2_1x7") - conv2 = self.conv_bn_layer( - conv2, - 64, (1, 7), - padding=(0, 3), - act='relu', - name="inception_stem2_7x1") - conv2 = self.conv_bn_layer( - conv2, 96, 3, act='relu', name="inception_stem2_3x3_2") - - concat = fluid.layers.concat([conv1, conv2], axis=1) - - conv1 = self.conv_bn_layer( - concat, 192, 3, stride=2, act='relu', name="inception_stem3_3x3_s2") - pool1 = fluid.layers.pool2d( - input=concat, pool_size=3, pool_stride=2, pool_type='max') - - concat = fluid.layers.concat([conv1, pool1], axis=1) - - return concat - - def inceptionA(self, data, name=None): - pool1 = fluid.layers.pool2d( - input=data, pool_size=3, pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer( - pool1, 96, 1, act='relu', name="inception_a" + name + "_1x1") - - conv2 = self.conv_bn_layer( - data, 96, 1, act='relu', name="inception_a" + name + "_1x1_2") - - conv3 = self.conv_bn_layer( - data, 64, 1, act='relu', name="inception_a" + name + "_3x3_reduce") - conv3 = self.conv_bn_layer( - conv3, - 96, - 3, - padding=1, - act='relu', - name="inception_a" + name + "_3x3") - - conv4 = self.conv_bn_layer( - data, - 64, - 1, - act='relu', - name="inception_a" + name + "_3x3_2_reduce") - conv4 = self.conv_bn_layer( - conv4, - 96, - 3, - padding=1, - act='relu', - 
name="inception_a" + name + "_3x3_2") - conv4 = self.conv_bn_layer( - conv4, - 96, - 3, - padding=1, - act='relu', - name="inception_a" + name + "_3x3_3") - - concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) - - return concat - - def reductionA(self, data, name=None): - pool1 = fluid.layers.pool2d( - input=data, pool_size=3, pool_stride=2, pool_type='max') - - conv2 = self.conv_bn_layer( - data, 384, 3, stride=2, act='relu', name="reduction_a_3x3") - - conv3 = self.conv_bn_layer( - data, 192, 1, act='relu', name="reduction_a_3x3_2_reduce") - conv3 = self.conv_bn_layer( - conv3, 224, 3, padding=1, act='relu', name="reduction_a_3x3_2") - conv3 = self.conv_bn_layer( - conv3, 256, 3, stride=2, act='relu', name="reduction_a_3x3_3") - - concat = fluid.layers.concat([pool1, conv2, conv3], axis=1) - - return concat - - def inceptionB(self, data, name=None): - pool1 = fluid.layers.pool2d( - input=data, pool_size=3, pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer( - pool1, 128, 1, act='relu', name="inception_b" + name + "_1x1") - - conv2 = self.conv_bn_layer( - data, 384, 1, act='relu', name="inception_b" + name + "_1x1_2") - - conv3 = self.conv_bn_layer( - data, 192, 1, act='relu', name="inception_b" + name + "_1x7_reduce") - conv3 = self.conv_bn_layer( - conv3, - 224, (1, 7), - padding=(0, 3), - act='relu', - name="inception_b" + name + "_1x7") - conv3 = self.conv_bn_layer( - conv3, - 256, (7, 1), - padding=(3, 0), - act='relu', - name="inception_b" + name + "_7x1") - - conv4 = self.conv_bn_layer( - data, - 192, - 1, - act='relu', - name="inception_b" + name + "_7x1_2_reduce") - conv4 = self.conv_bn_layer( - conv4, - 192, (1, 7), - padding=(0, 3), - act='relu', - name="inception_b" + name + "_1x7_2") - conv4 = self.conv_bn_layer( - conv4, - 224, (7, 1), - padding=(3, 0), - act='relu', - name="inception_b" + name + "_7x1_2") - conv4 = self.conv_bn_layer( - conv4, - 224, (1, 7), - padding=(0, 3), - act='relu', - name="inception_b" + name + 
"_1x7_3") - conv4 = self.conv_bn_layer( - conv4, - 256, (7, 1), - padding=(3, 0), - act='relu', - name="inception_b" + name + "_7x1_3") - - concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1) - - return concat - - def reductionB(self, data, name=None): - pool1 = fluid.layers.pool2d( - input=data, pool_size=3, pool_stride=2, pool_type='max') - - conv2 = self.conv_bn_layer( - data, 192, 1, act='relu', name="reduction_b_3x3_reduce") - conv2 = self.conv_bn_layer( - conv2, 192, 3, stride=2, act='relu', name="reduction_b_3x3") - - conv3 = self.conv_bn_layer( - data, 256, 1, act='relu', name="reduction_b_1x7_reduce") - conv3 = self.conv_bn_layer( - conv3, - 256, (1, 7), - padding=(0, 3), - act='relu', - name="reduction_b_1x7") - conv3 = self.conv_bn_layer( - conv3, - 320, (7, 1), - padding=(3, 0), - act='relu', - name="reduction_b_7x1") - conv3 = self.conv_bn_layer( - conv3, 320, 3, stride=2, act='relu', name="reduction_b_3x3_2") - - concat = fluid.layers.concat([pool1, conv2, conv3], axis=1) - - return concat - - def inceptionC(self, data, name=None): - pool1 = fluid.layers.pool2d( - input=data, pool_size=3, pool_padding=1, pool_type='avg') - conv1 = self.conv_bn_layer( - pool1, 256, 1, act='relu', name="inception_c" + name + "_1x1") - - conv2 = self.conv_bn_layer( - data, 256, 1, act='relu', name="inception_c" + name + "_1x1_2") - - conv3 = self.conv_bn_layer( - data, 384, 1, act='relu', name="inception_c" + name + "_1x1_3") - conv3_1 = self.conv_bn_layer( - conv3, - 256, (1, 3), - padding=(0, 1), - act='relu', - name="inception_c" + name + "_1x3") - conv3_2 = self.conv_bn_layer( - conv3, - 256, (3, 1), - padding=(1, 0), - act='relu', - name="inception_c" + name + "_3x1") - - conv4 = self.conv_bn_layer( - data, 384, 1, act='relu', name="inception_c" + name + "_1x1_4") - conv4 = self.conv_bn_layer( - conv4, - 448, (1, 3), - padding=(0, 1), - act='relu', - name="inception_c" + name + "_1x3_2") - conv4 = self.conv_bn_layer( - conv4, - 512, (3, 1), - 
padding=(1, 0), - act='relu', - name="inception_c" + name + "_3x1_2") - conv4_1 = self.conv_bn_layer( - conv4, - 256, (1, 3), - padding=(0, 1), - act='relu', - name="inception_c" + name + "_1x3_3") - conv4_2 = self.conv_bn_layer( - conv4, - 256, (3, 1), - padding=(1, 0), - act='relu', - name="inception_c" + name + "_3x1_3") - - concat = fluid.layers.concat( - [conv1, conv2, conv3_1, conv3_2, conv4_1, conv4_2], axis=1) - - return concat diff --git a/fluid/PaddleCV/image_classification/reader.py b/fluid/PaddleCV/image_classification/reader.py index f79d87b0dc35db0d1a397daefc5e17e6c1a5f917..d8aa9da49b9e0caf28f72261965814c5cdca914d 100644 --- a/fluid/PaddleCV/image_classification/reader.py +++ b/fluid/PaddleCV/image_classification/reader.py @@ -156,13 +156,12 @@ def _reader_creator(file_list, for line in lines: if mode == 'train' or mode == 'val': img_path, label = line.split() - #img_path = img_path.replace("JPEG", "jpeg") img_path = os.path.join(data_dir, img_path) yield img_path, int(label) elif mode == 'test': img_path, label = line.split() - #img_path = img_path.replace("JPEG", "jpeg") img_path = os.path.join(data_dir, img_path) + yield [img_path] mapper = functools.partial( @@ -185,9 +184,11 @@ def train(data_dir=DATA_DIR, pass_id_as_seed=0): def val(data_dir=DATA_DIR): file_list = os.path.join(data_dir, 'val_list.txt') - return _reader_creator(file_list, 'val', shuffle=False, data_dir=data_dir) + return _reader_creator(file_list, 'val', shuffle=False, + data_dir=data_dir) def test(data_dir=DATA_DIR): file_list = os.path.join(data_dir, 'val_list.txt') - return _reader_creator(file_list, 'test', shuffle=False, data_dir=data_dir) + return _reader_creator(file_list, 'test', shuffle=False, + data_dir=data_dir) diff --git a/fluid/PaddleCV/image_classification/reader_cv2.py b/fluid/PaddleCV/image_classification/reader_cv2.py index dd9462c5c7076d57f7e137153a93534152cbb64b..7be5baa8014562d44371372318e4c8c81303c5fe 100644 --- 
a/fluid/PaddleCV/image_classification/reader_cv2.py +++ b/fluid/PaddleCV/image_classification/reader_cv2.py @@ -16,7 +16,6 @@ THREAD = 8 BUF_SIZE = 102400 DATA_DIR = 'data/ILSVRC2012' - img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1)) img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1)) @@ -40,8 +39,9 @@ def random_crop(img, size, scale=None, ratio=None): w = 1. * aspect_ratio h = 1. / aspect_ratio - bound = min((float(img.shape[1]) / img.shape[0]) / (w**2), - (float(img.shape[0]) / img.shape[1]) / (h**2)) + + bound = min((float(img.shape[0]) / img.shape[1]) / (w**2), + (float(img.shape[1]) / img.shape[0]) / (h**2)) scale_max = min(scale[1], bound) scale_min = min(scale[0], bound) @@ -50,15 +50,14 @@ def random_crop(img, size, scale=None, ratio=None): target_size = math.sqrt(target_area) w = int(target_size * w) h = int(target_size * h) + i = np.random.randint(0, img.shape[0] - w + 1) + j = np.random.randint(0, img.shape[1] - h + 1) - i = np.random.randint(0, img.size[0] - w + 1) - j = np.random.randint(0, img.size[1] - h + 1) + img = img[i:i + w, j:j + h, :] - img = img[i:i + h, j:j + w, :] - resized = cv2.resize(img, (size, size)) + resized = cv2.resize(img, (size, size), interpolation=cv2.INTER_LANCZOS4) return resized - def distort_color(img): return img @@ -68,7 +67,7 @@ def resize_short(img, target_size): percent = float(target_size) / min(img.shape[0], img.shape[1]) resized_width = int(round(img.shape[1] * percent)) resized_height = int(round(img.shape[0] * percent)) - resized = cv2.resize(img, (resized_width, resized_height)) + resized = cv2.resize(img, (resized_width, resized_height), interpolation=cv2.INTER_LANCZOS4) return resized @@ -140,16 +139,19 @@ def _reader_creator(file_list, shuffle=False, color_jitter=False, rotate=False, - data_dir=DATA_DIR): + data_dir=DATA_DIR, + pass_id_as_seed=0): def reader(): with open(file_list) as flist: full_lines = [line.strip() for line in flist] if shuffle: - np.random.shuffle(lines) + if 
pass_id_as_seed: + np.random.seed(pass_id_as_seed) + np.random.shuffle(full_lines) if mode == 'train' and os.getenv('PADDLE_TRAINING_ROLE'): # distributed mode if the env var `PADDLE_TRAINING_ROLE` exits trainer_id = int(os.getenv("PADDLE_TRAINER_ID", "0")) - trainer_count = int(os.getenv("PADDLE_TRAINERS", "1")) + trainer_count = int(os.getenv("PADDLE_TRAINERS_NUM", "1")) per_node_lines = len(full_lines) // trainer_count lines = full_lines[trainer_id * per_node_lines:(trainer_id + 1) * per_node_lines] @@ -159,6 +161,7 @@ def _reader_creator(file_list, len(full_lines))) else: lines = full_lines + for line in lines: if mode == 'train' or mode == 'val': img_path, label = line.split() @@ -166,21 +169,25 @@ def _reader_creator(file_list, img_path = os.path.join(data_dir, img_path) yield img_path, int(label) elif mode == 'test': - img_path = os.path.join(DATA_DIR, line) + img_path, label = line.split() + img_path = img_path.replace("JPEG", "jpeg") + img_path = os.path.join(data_dir, img_path) + yield [img_path] image_mapper = functools.partial( process_image, mode=mode, color_jitter=color_jitter, - rotate=color_jitter, + rotate=rotate, crop_size=224) reader = paddle.reader.xmap_readers( image_mapper, reader, THREAD, BUF_SIZE, order=False) return reader -def train(data_dir=DATA_DIR): +def train(data_dir=DATA_DIR, pass_id_as_seed=0): + file_list = os.path.join(data_dir, 'train_list.txt') return _reader_creator( file_list, @@ -188,14 +195,17 @@ def train(data_dir=DATA_DIR): shuffle=True, color_jitter=False, rotate=False, - data_dir=data_dir) + data_dir=data_dir, + pass_id_as_seed=pass_id_as_seed) def val(data_dir=DATA_DIR): file_list = os.path.join(data_dir, 'val_list.txt') - return _reader_creator(file_list, 'val', shuffle=False, data_dir=data_dir) + return _reader_creator(file_list, 'val', shuffle=False, + data_dir=data_dir) def test(data_dir=DATA_DIR): file_list = os.path.join(data_dir, 'val_list.txt') - return _reader_creator(file_list, 'test', shuffle=False, 
data_dir=data_dir) + return _reader_creator(file_list, 'test', shuffle=False, + data_dir=data_dir) diff --git a/fluid/PaddleCV/image_classification/run.sh b/fluid/PaddleCV/image_classification/run.sh index 741bc3a688a185425b9e3d843c7cc236c5b71b32..cc516a677771c2c22bdf702d0ae77916a1bd8f06 100755 --- a/fluid/PaddleCV/image_classification/run.sh +++ b/fluid/PaddleCV/image_classification/run.sh @@ -12,7 +12,6 @@ python train.py \ --lr=0.1 \ --num_epochs=200 \ --l2_decay=1.2e-4 \ - --model_category=models_name \ # >log_SE_ResNeXt50_32x4d.txt 2>&1 & #AlexNet: #python train.py \ @@ -23,7 +22,6 @@ python train.py \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ # --with_mem_opt=True \ -# --model_category=models_name \ # --lr_strategy=piecewise_decay \ # --num_epochs=120 \ # --lr=0.01 \ @@ -38,7 +36,6 @@ python train.py \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ # --with_mem_opt=True \ -# --model_category=models_name \ # --lr_strategy=piecewise_decay \ # --num_epochs=120 \ # --lr=0.1 \ @@ -51,12 +48,63 @@ python train.py \ # --class_dim=1000 \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ -# --model_category=models_name \ # --with_mem_opt=True \ # --lr_strategy=cosine_decay \ # --num_epochs=240 \ # --lr=0.1 \ # --l2_decay=4e-5 +#ResNet18: +#python train.py \ +# --model=ResNet18 \ +# --batch_size=256 \ +# --total_images=1281167 \ +# --class_dim=1000 \ +# --image_shape=3,224,224 \ +# --model_save_dir=output/ \ +# --with_mem_opt=True \ +# --lr_strategy=cosine_decay \ +# --lr=0.1 \ +# --num_epochs=120 \ +# --l2_decay=1e-4 +#ResNet34: +#python train.py \ +# --model=ResNet34 \ +# --batch_size=256 \ +# --total_images=1281167 \ +# --class_dim=1000 \ +# --image_shape=3,224,224 \ +# --model_save_dir=output/ \ +# --with_mem_opt=True \ +# --lr_strategy=cosine_decay \ +# --lr=0.1 \ +# --num_epochs=120 \ +# --l2_decay=1e-4 +#ShuffleNetv2: +#python train.py \ +# --model=ShuffleNetV2 \ +# --batch_size=1024 \ +# --total_images=1281167 \ +# 
--class_dim=1000 \ +# --image_shape=3,224,224 \ +# --model_save_dir=output/ \ +# --with_mem_opt=True \ +# --lr_strategy=cosine_warmup_decay \ +# --lr=0.5 \ +# --num_epochs=240 \ +# --l2_decay=4e-5 +#GoogleNet: +#python train.py \ +# --model=GoogleNet \ +# --batch_size=256 \ +# --total_images=1281167 \ +# --class_dim=1000 \ +# --image_shape=3,224,224 \ +# --model_save_dir=output/ \ +# --with_mem_opt=True \ +# --lr_strategy=cosine_decay \ +# --lr=0.01 \ +# --num_epochs=200 \ +# --l2_decay=1e-4 #ResNet50: #python train.py \ # --model=ResNet50 \ @@ -66,7 +114,6 @@ python train.py \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ # --with_mem_opt=True \ -# --model_category=models_name \ # --lr_strategy=piecewise_decay \ # --num_epochs=120 \ # --lr=0.1 \ @@ -80,7 +127,6 @@ python train.py \ # --class_dim=1000 \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ -# --model_category=models_name \ # --with_mem_opt=True \ # --lr_strategy=piecewise_decay \ # --num_epochs=120 \ @@ -96,7 +142,6 @@ python train.py \ # --image_shape=3,224,224 \ # --model_save_dir=output/ \ # --lr_strategy=piecewise_decay \ -# --model_category=models_name \ # --with_mem_opt=True \ # --lr=0.1 \ # --num_epochs=120 \ @@ -111,7 +156,6 @@ python train.py \ # --class_dim=1000 \ # --image_shape=3,224,224 \ # --lr_strategy=cosine_decay \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --lr=0.1 \ # --num_epochs=200 \ @@ -126,7 +170,6 @@ python train.py \ # --class_dim=1000 \ # --image_shape=3,224,224 \ # --lr_strategy=cosine_decay \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --lr=0.1 \ # --num_epochs=200 \ @@ -141,7 +184,6 @@ python train.py \ # --image_shape=3,224,224 \ # --lr_strategy=cosine_decay \ # --class_dim=1000 \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --lr=0.1 \ # --num_epochs=90 \ @@ -158,7 +200,6 @@ python train.py \ # --lr_strategy=cosine_decay \ # --lr=0.01 \ # --num_epochs=90 \ -# --model_category=models_name
\ # --model_save_dir=output/ \ # --with_mem_opt=True \ # --l2_decay=3e-4 @@ -171,7 +212,6 @@ python train.py \ # --class_dim=1000 \ # --lr_strategy=cosine_decay \ # --image_shape=3,224,224 \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --lr=0.01 \ # --num_epochs=90 \ @@ -189,7 +229,6 @@ python train.py \ # --lr=0.01 \ # --num_epochs=90 \ # --with_mem_opt=True \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --l2_decay=3e-4 @@ -204,7 +243,6 @@ python train.py \ # --lr=0.001 \ # --num_epochs=120 \ # --with_mem_opt=False \ -# --model_category=models_name \ # --model_save_dir=output/ \ # --lr_strategy=adam \ # --use_gpu=False diff --git a/fluid/PaddleCV/image_classification/train.py b/fluid/PaddleCV/image_classification/train.py index adf7febc9ffe2b089fc2c5dd457c6469d5b3cb43..b18fdd4693cd3af4bf2ad6bb6f595a789d8aaeb9 100644 --- a/fluid/PaddleCV/image_classification/train.py +++ b/fluid/PaddleCV/image_classification/train.py @@ -10,14 +10,15 @@ import math import paddle import paddle.fluid as fluid import paddle.dataset.flowers as flowers -import reader +import reader as reader import argparse import functools import subprocess import utils -from utils.learning_rate import cosine_decay +import models from utils.fp16_utils import create_master_params_grads, master_param_to_train_param -from utility import add_arguments, print_arguments +from utils.utility import add_arguments, print_arguments +from utils.learning_rate import cosine_decay_with_warmup IMAGENET1000 = 1281167 @@ -44,19 +45,6 @@ add_arg('fp16', bool, False, "Enable half precision add_arg('scale_loss', float, 1.0, "Scale loss for fp16." 
) add_arg('l2_decay', float, 1e-4, "L2_decay parameter.") add_arg('momentum_rate', float, 0.9, "momentum_rate.") -# yapf: enable - - -def set_models(model_category): - global models - assert model_category in ["models", "models_name" - ], "{} is not in lists: {}".format( - model_category, ["models", "models_name"]) - if model_category == "models_name": - import models_name as models - else: - import models as models - def optimizer_setting(params): ls = params["learning_strategy"] @@ -68,8 +56,7 @@ def optimizer_setting(params): else: total_images = params["total_images"] batch_size = ls["batch_size"] - step = int(total_images / batch_size + 1) - + step = int(math.ceil(float(total_images) / batch_size)) bd = [step * e for e in ls["epochs"]] base_lr = params["lr"] lr = [] @@ -88,16 +75,34 @@ def optimizer_setting(params): batch_size = ls["batch_size"] l2_decay = params["l2_decay"] momentum_rate = params["momentum_rate"] - step = int(total_images / batch_size + 1) + step = int(math.ceil(float(total_images) / batch_size)) + lr = params["lr"] + num_epochs = params["num_epochs"] + optimizer = fluid.optimizer.Momentum( + learning_rate=fluid.layers.cosine_decay( + learning_rate=lr, step_each_epoch=step, epochs=num_epochs), + momentum=momentum_rate, + regularization=fluid.regularizer.L2Decay(l2_decay)) + + elif ls["name"] == "cosine_warmup_decay": + if "total_images" not in params: + total_images = IMAGENET1000 + else: + total_images = params["total_images"] + batch_size = ls["batch_size"] + l2_decay = params["l2_decay"] + momentum_rate = params["momentum_rate"] + step = int(math.ceil(float(total_images) / batch_size)) lr = params["lr"] num_epochs = params["num_epochs"] optimizer = fluid.optimizer.Momentum( - learning_rate=cosine_decay( + learning_rate=cosine_decay_with_warmup( learning_rate=lr, step_each_epoch=step, epochs=num_epochs), momentum=momentum_rate, regularization=fluid.regularizer.L2Decay(l2_decay)) + elif ls["name"] == "linear_decay": if "total_images" not in 
params: total_images = IMAGENET1000
@@ -119,6 +124,25 @@ def optimizer_setting(params):
     elif ls["name"] == "adam":
         lr = params["lr"]
         optimizer = fluid.optimizer.Adam(learning_rate=lr)
+    elif ls["name"] == "rmsprop_cosine":
+        if "total_images" not in params:
+            total_images = IMAGENET1000
+        else:
+            total_images = params["total_images"]
+        batch_size = ls["batch_size"]
+        l2_decay = params["l2_decay"]
+        momentum_rate = params["momentum_rate"]
+        step = int(math.ceil(float(total_images) / batch_size))
+        lr = params["lr"]
+        num_epochs = params["num_epochs"]
+        optimizer = fluid.optimizer.RMSProp(
+            learning_rate=fluid.layers.cosine_decay(
+                learning_rate=lr, step_each_epoch=step, epochs=num_epochs),
+            momentum=momentum_rate,
+            regularization=fluid.regularizer.L2Decay(l2_decay),
+            # RMSProp Optimizer: Apply epsilon=1 on ImageNet.
+            epsilon=1
+        )
     else:
         lr = params["lr"]
         l2_decay = params["l2_decay"]
@@ -130,7 +154,6 @@ def optimizer_setting(params):
 
     return optimizer
 
-
 def net_config(image, label, model, args):
     model_list = [m for m in dir(models) if "__" not in m]
     assert args.model in model_list, "{} is not lists: {}".format(args.model,
@@ -220,6 +243,13 @@ def build_program(is_train, main_prog, startup_prog, args):
     else:
         return py_reader, avg_cost, acc_top1, acc_top5
 
+def get_device_num():
+    visible_device = os.getenv('CUDA_VISIBLE_DEVICES')
+    if visible_device:
+        device_num = len(visible_device.split(','))
+    else:
+        device_num = subprocess.check_output(['nvidia-smi','-L']).decode().count('\n')
+    return device_num
 
 def train(args):
     # parameters from arguments
@@ -268,12 +298,7 @@ def train(args):
             exe, pretrained_model, main_program=train_prog, predicate=if_exist)
 
     if args.use_gpu:
-        visible_device = os.getenv('CUDA_VISIBLE_DEVICES')
-        if visible_device:
-            device_num = len(visible_device.split(','))
-        else:
-            device_num = subprocess.check_output(
-                ['nvidia-smi', '-L']).decode().count('\n')
+        device_num = get_device_num()
     else:
         device_num = 1
     train_batch_size = args.batch_size / device_num
@@ -345,8 +370,8 @@ def train(args):
 
                 if batch_id % 10 == 0:
                     print("Pass {0}, trainbatch {1}, loss {2}, \
-                        acc1 {3}, acc5 {4}, lr{5}, time {6}"
-                          .format(pass_id, batch_id, loss, acc1, acc5, "%.5f" %
+                        acc1 {3}, acc5 {4}, lr {5}, time {6}"
+                          .format(pass_id, batch_id, "%.5f"%loss, "%.5f"%acc1, "%.5f"%acc5, "%.5f" %
                                   lr, "%2.2f sec" % period))
                     sys.stdout.flush()
                 batch_id += 1
@@ -378,7 +403,7 @@ def train(args):
                 if test_batch_id % 10 == 0:
                     print("Pass {0},testbatch {1},loss {2}, \
                         acc1 {3},acc5 {4},time {5}"
-                          .format(pass_id, test_batch_id, loss, acc1, acc5,
+                          .format(pass_id, test_batch_id, "%.5f"%loss,"%.5f"%acc1, "%.5f"%acc5,
                                   "%2.2f sec" % period))
                     sys.stdout.flush()
                 test_batch_id += 1
@@ -391,8 +416,8 @@ def train(args):
 
         print("End pass {0}, train_loss {1}, train_acc1 {2}, train_acc5 {3}, "
               "test_loss {4}, test_acc1 {5}, test_acc5 {6}".format(
-                  pass_id, train_loss, train_acc1, train_acc5, test_loss,
-                  test_acc1, test_acc5))
+                  pass_id, "%.5f"%train_loss, "%.5f"%train_acc1, "%.5f"%train_acc5, "%.5f"%test_loss,
+                  "%.5f"%test_acc1, "%.5f"%test_acc5))
         sys.stdout.flush()
 
         model_path = os.path.join(model_save_dir + '/' + model_name,
@@ -429,7 +454,6 @@ def train(args):
 
 def main():
     args = parser.parse_args()
-    set_models(args.model_category)
     print_arguments(args)
     train(args)
 
diff --git a/fluid/PaddleCV/image_classification/utils/__init__.py b/fluid/PaddleCV/image_classification/utils/__init__.py
index 4751caceeb14f0dddc937d90b4c953a870ffc3f8..1e025483d26b01c32ccf13127c5f1c5078737a17 100644
--- a/fluid/PaddleCV/image_classification/utils/__init__.py
+++ b/fluid/PaddleCV/image_classification/utils/__init__.py
@@ -1,2 +1,3 @@
 from .learning_rate import cosine_decay, lr_warmup
 from .fp16_utils import create_master_params_grads, master_param_to_train_param
+from .utility import add_arguments, print_arguments
diff --git a/fluid/PaddleCV/image_classification/utils/learning_rate.py b/fluid/PaddleCV/image_classification/utils/learning_rate.py
index 15f7f6e52073a8a46c7dc3ce8b7dbda9f58c2019..d52d8ebebeeb93e39c1e01be9e6789a682279490 100644
--- a/fluid/PaddleCV/image_classification/utils/learning_rate.py
+++ b/fluid/PaddleCV/image_classification/utils/learning_rate.py
@@ -21,6 +21,33 @@ def cosine_decay(learning_rate, step_each_epoch, epochs=120):
             (ops.cos(epoch * (math.pi / epochs)) + 1)/2
     return decayed_lr
 
+def cosine_decay_with_warmup(learning_rate, step_each_epoch, epochs=120):
+    """Applies cosine decay to the learning rate.
+    lr = 0.05 * (math.cos(epoch * (math.pi / 120)) + 1)
+    decrease lr for every mini-batch and start with warmup.
+    """
+    global_step = _decay_step_counter()
+    lr = fluid.layers.tensor.create_global_var(
+        shape=[1],
+        value=0.0,
+        dtype='float32',
+        persistable=True,
+        name="learning_rate")
+
+    warmup_epoch = fluid.layers.fill_constant(
+        shape=[1], dtype='float32', value=float(5), force_cpu=True)
+
+    with init_on_cpu():
+        epoch = ops.floor(global_step / step_each_epoch)
+        with control_flow.Switch() as switch:
+            with switch.case(epoch < warmup_epoch):
+                decayed_lr = learning_rate * (global_step / (step_each_epoch * warmup_epoch))
+                fluid.layers.tensor.assign(input=decayed_lr, output=lr)
+            with switch.default():
+                decayed_lr = learning_rate * \
+                    (ops.cos((global_step - warmup_epoch * step_each_epoch) * (math.pi / (epochs * step_each_epoch))) + 1)/2
+                fluid.layers.tensor.assign(input=decayed_lr, output=lr)
+    return lr
 def lr_warmup(learning_rate, warmup_steps, start_lr, end_lr):
     """
     Applies linear learning rate warmup for distributed training
diff --git a/fluid/PaddleCV/image_classification/utility.py b/fluid/PaddleCV/image_classification/utils/utility.py
similarity index 91%
rename from fluid/PaddleCV/image_classification/utility.py
rename to fluid/PaddleCV/image_classification/utils/utility.py
index 5b10a179ac2231cb26ab42993b7300d5e99f44bc..c28646da24cb0fb42a91a2cbff92c20db307da81 100644
--- a/fluid/PaddleCV/image_classification/utility.py
+++ b/fluid/PaddleCV/image_classification/utils/utility.py
@@ -37,10 +37,10 @@ def print_arguments(args):
     :param args: Input argparse.Namespace for printing.
     :type args: argparse.Namespace
     """
-    print("----------- Configuration Arguments -----------")
+    print("------------- Configuration Arguments -------------")
     for arg, value in sorted(six.iteritems(vars(args))):
-        print("%s: %s" % (arg, value))
-    print("------------------------------------------------")
+        print("%25s : %s" % (arg, value))
+    print("----------------------------------------------------")
 
 
 def add_arguments(argname, type, default, help, argparser, **kwargs):
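
Note on the `cosine_decay_with_warmup` hunk in `utils/learning_rate.py`: the schedule it builds from fluid ops can be checked by hand with a plain-Python sketch. This is not part of the patch. The function name `warmup_cosine_lr` and the parameters `base_lr` and `warmup_epochs` are hypothetical names chosen for illustration; the arithmetic is meant to mirror the two `switch` branches in the diff, assuming the hard-coded warmup of 5 epochs.

```python
import math


def warmup_cosine_lr(global_step, base_lr, step_each_epoch,
                     epochs=120, warmup_epochs=5):
    """Plain-Python mirror of the two branches in the diff (illustrative only)."""
    epoch = global_step // step_each_epoch
    warmup_steps = warmup_epochs * step_each_epoch
    if epoch < warmup_epochs:
        # Linear warmup: the rate grows from 0 to base_lr over the warmup steps.
        return base_lr * global_step / warmup_steps
    # Cosine branch: the step counter is shifted by the warmup steps, but the
    # period still uses the full epochs * step_each_epoch, as in the diff.
    angle = (global_step - warmup_steps) * math.pi / (epochs * step_each_epoch)
    return base_lr * (math.cos(angle) + 1) / 2


if __name__ == "__main__":
    # Print the rate at a few steps for a toy setting of 10 steps per epoch.
    for step in (0, 25, 50, 600, 1200):
        print(step, warmup_cosine_lr(step, base_lr=0.1, step_each_epoch=10))
```

Under these assumptions the rate equals `base_lr` at the first step after warmup. Because the cosine argument is shifted by the warmup steps but divided by the full `epochs * step_each_epoch`, the rate stays slightly above zero at the final training step.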