diff --git a/docs/zh_CN/training/semi_supervised_learning/FixMatch.md b/docs/zh_CN/training/semi_supervised_learning/FixMatch.md
new file mode 100644
index 0000000000000000000000000000000000000000..73dad00b952b37c338d3094c5926dd623efa907b
--- /dev/null
+++ b/docs/zh_CN/training/semi_supervised_learning/FixMatch.md
@@ -0,0 +1,206 @@
+**简体中文 | English(TODO)**
+
+# FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
+
+**Paper:** [https://arxiv.org/abs/2001.07685](https://arxiv.org/abs/2001.07685)
+
+## Contents
+
+* [1. Introduction](#1-introduction)
+* [2. Accuracy](#2-accuracy)
+* [3. Data Preparation](#3-data-preparation)
+* [4. Model Training](#4-model-training)
+* [5. Model Evaluation and Inference Deployment](#5-model-evaluation-and-inference-deployment)
+  * [5.1 Model Evaluation](#51-model-evaluation)
+  * [5.2 Model Inference](#52-model-inference)
+    * [5.2.1 Preparing the Inference Model](#521-preparing-the-inference-model)
+    * [5.2.2 Inference with the Python Prediction Engine](#522-inference-with-the-python-prediction-engine)
+    * [5.2.3 Inference with the C++ Prediction Engine](#523-inference-with-the-c-prediction-engine)
+  * [5.4 Serving Deployment](#54-serving-deployment)
+  * [5.5 On-Device Deployment](#55-on-device-deployment)
+  * [5.6 Paddle2ONNX Model Conversion and Prediction](#56-paddle2onnx-model-conversion-and-prediction)
+* [6. References](#6-references)
+
+## 1. Introduction
+
+The authors propose a simple yet effective semi-supervised learning method. While the model is trained on the labeled data, every unlabeled image is augmented twice, once weakly and once strongly. If the model's prediction on the weakly augmented view is confident enough, i.e. its largest class probability exceeds a threshold, the predicted class is used as a pseudo-label, and the loss is computed between this pseudo-label and the model's output on the strongly augmented view. The procedure is illustrated below.
+
+![](https://raw.githubusercontent.com/google-research/fixmatch/master/media/FixMatch%20diagram.png)
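+
+This pseudo-labeling rule can be summarized in a few lines of Paddle code. The snippet below is only an illustrative sketch (the function name, tensor names and default values are placeholders, not part of this patch); the actual training logic added by this PR lives in `ppcls/engine/train/train_fixmatch.py`.
+
+```python
+import paddle
+import paddle.nn.functional as F
+
+def fixmatch_unlabeled_loss(logits_weak, logits_strong, threshold=0.95, temperature=1.0):
+    """Pseudo-label the weak view and supervise the strong view on confident samples only."""
+    probs = F.softmax(logits_weak.detach() / temperature, axis=-1)
+    confidence = paddle.max(probs, axis=-1)              # largest class probability per sample
+    pseudo_labels = paddle.argmax(probs, axis=-1)        # hard pseudo-labels
+    mask = (confidence >= threshold).astype('float32')   # keep only confident predictions
+    per_sample = F.cross_entropy(logits_strong, pseudo_labels, reduction='none')
+    return (per_sample * mask).mean()
+```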
+
+## 2. Accuracy
+
+The table below summarizes the accuracy of the reproduced FixMatch on the CIFAR-10 dataset.
+
+| Labels             | 40           | 250          | 4000         |
+| ------------------ | ------------ | ------------ | ------------ |
+| Paper (TensorFlow) | 86.19 ± 3.37 | 94.93 ± 0.65 | 95.74 ± 0.05 |
+| PyTorch version    | 93.60        | 95.31        | 95.77        |
+| Paddle version     | 93.14        | 95.37        | 95.89        |
+
+The Paddle configuration files and trained models for CIFAR-10 are listed below.
+
+| Labels | Config file | Model |
+| ------ | ----------- | ----- |
+| 40   | [config](../../../../ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml) | [model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/semi_superwised_learning/FixMatch_WideResNet_cifar10_label40.pdparams) |
+| 250  | [config](../../../../ppcls/configs/ssl/FixMatch/FixMatch_cifar10_250.yaml) | [model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/semi_superwised_learning/FixMatch_WideResNet_cifar10_label250.pdparams) |
+| 4000 | [config](../../../../ppcls/configs/ssl/FixMatch/FixMatch_cifar10_4000.yaml) | [model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/semi_superwised_learning/FixMatch_WideResNet_cifar10_label4000.pdparams) |
+
+The rest of this document takes the `FixMatch/FixMatch_cifar10_40.yaml` configuration and its trained model as an example to walk through training, evaluation and inference on CIFAR-10.
+
+## 3. Data Preparation
+
+During training and evaluation the CIFAR-10 dataset is downloaded automatically, so please keep the network connection available. If the network is unreliable, download the [dataset](https://dataset.bj.bcebos.com/cifar/cifar-10-python.tar.gz) in advance and append the following arguments to the commands below:
+
+```
+${cmd} -o DataLoader.Train.dataset.data_file=${data_file} -o DataLoader.UnLabelTrain.dataset.data_file=${data_file} -o DataLoader.Eval.dataset.data_file=${data_file}
+```
+
+Here `${cmd}` is the command to run and `${data_file}` is the path of the downloaded archive. For example, the single-GPU training command from [4. Model Training](#4-model-training) becomes:
+
+```shell
+python tools/train.py -c ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml -o DataLoader.Train.dataset.data_file=cifar-10-python.tar.gz -o DataLoader.UnLabelTrain.dataset.data_file=cifar-10-python.tar.gz -o DataLoader.Eval.dataset.data_file=cifar-10-python.tar.gz
+```
+
+## 4. Model Training
+
+1. Run the following command to start training.
+   Single-GPU training:
+
+   ```
+   python tools/train.py -c ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml
+   ```
+
+   Note: training on a single GPU takes roughly 2 to 4 days.
+2. Check the training log and the saved model files.
+   During training, metrics such as the loss are printed to the screen in real time. The log file `train.log`, the model parameter files `*.pdparams` and the optimizer state files `*.pdopt` are saved to the directory specified by `Global.output_dir`, which defaults to `PaddleClas/output/WideResNet/`.
+
+## 5. Model Evaluation and Inference Deployment
+
+### 5.1 Model Evaluation
+
+Prepare a `*.pdparams` parameter file for evaluation. You can either use the released trained model or a checkpoint saved in [4. Model Training](#4-model-training).
+
+* To evaluate the `best_model_ema.ema.pdparams` checkpoint saved during training, run:
+
+  ```
+  python3.7 tools/eval.py \
+      -c ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml \
+      -o Global.pretrained_model="./output/WideResNet/best_model_ema.ema"
+  ```
+* To evaluate the released model, download it into the `PaddleClas/pretrained_models` folder and run:
+
+  ```
+  # download the model
+  cd PaddleClas
+  mkdir pretrained_models
+  cd pretrained_models
+  wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/semi_superwised_learning/FixMatch_WideResNet_cifar10_label40.pdparams
+  cd ..
+  # evaluate
+  python3.7 tools/eval.py \
+      -c ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml \
+      -o Global.pretrained_model="pretrained_models/FixMatch_WideResNet_cifar10_label40"
+  ```
+
+  Note: the path passed to `pretrained_model` must not include the `.pdparams` suffix; it is appended automatically at runtime.
+* Check the output:
+
+  ```
+  ...
+  ... CELoss: 0.58960, loss: 0.58960, top1: 0.95312, top5: 0.98438, batch_cost: 3.00355s, reader_cost: 1.09548, ips: 21.30810 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 20/157]CELoss: 0.14618, loss: 0.14618, top1: 0.93601, top5: 0.99628, batch_cost: 0.02379s, reader_cost: 0.00016, ips: 2690.05243 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 40/157]CELoss: 0.01801, loss: 0.01801, top1: 0.93216, top5: 0.99505, batch_cost: 0.02716s, reader_cost: 0.00015, ips: 2356.48846 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 60/157]CELoss: 0.63351, loss: 0.63351, top1: 0.92982, top5: 0.99539, batch_cost: 0.02585s, reader_cost: 0.00015, ips: 2475.86506 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 80/157]CELoss: 0.85084, loss: 0.85084, top1: 0.93191, top5: 0.99576, batch_cost: 0.02578s, reader_cost: 0.00015, ips: 2482.59021 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 100/157]CELoss: 0.04171, loss: 0.04171, top1: 0.93147, top5: 0.99567, batch_cost: 0.02676s, reader_cost: 0.00015, ips: 2391.99053 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 120/157]CELoss: 0.89842, loss: 0.89842, top1: 0.93027, top5: 0.99561, batch_cost: 0.02647s, reader_cost: 0.00015, ips: 2418.24635 images/sec
+  ppcls INFO: [Eval][Epoch 0][Iter: 140/157]CELoss: 0.57866, loss: 0.57866, top1: 0.93107, top5: 0.99568, batch_cost: 0.02678s, reader_cost: 0.00015, ips: 2389.46068 images/sec
+  ppcls INFO: [Eval][Epoch 0][Avg]CELoss: 0.59721, loss: 0.59721, top1: 0.93140, top5: 0.99570
+  ```
+
+  By default the evaluation log is saved to `PaddleClas/output/WideResNet/eval.log`. As shown above, the released model reaches top1: 0.93140 and top5: 0.99570 on CIFAR-10.
+
+### 5.2 Model Inference
+
+#### 5.2.1 Preparing the Inference Model
+
+Convert the parameter file saved during training into an inference model. Again taking `best_model_ema.ema.pdparams` as an example, run:
+
+```
+python3.7 tools/export_model.py \
+-c ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml \
+-o Global.pretrained_model=output/WideResNet/best_model_ema.ema \
+-o Global.save_inference_dir="./deploy/inference"
+```
+
+#### 5.2.2 Inference with the Python Prediction Engine
+
+1. Modify `PaddleClas/deploy/configs/inference_rec.yaml`
+
+   - Set `infer_imgs:` to the path of the image to be predicted (the configuration below uses the path of a `demo.jpg` image)
+   - Set `rec_inference_model_dir:` to the directory of the exported inference model
+   - Replace the preprocessing under `transform_ops:` with the `Eval.dataset` preprocessing from `FixMatch_cifar10_40.yaml`
+
+   ```
+   Global:
+     infer_imgs: "demo.jpg"
+     rec_inference_model_dir: "./inference"
+     batch_size: 1
+     use_gpu: False
+     enable_mkldnn: True
+     cpu_num_threads: 10
+     enable_benchmark: False
+     use_fp16: False
+     ir_optim: True
+     use_tensorrt: False
+     gpu_mem: 8000
+     enable_profile: False
+
+   RecPreProcess:
+     transform_ops:
+       - NormalizeImage:
+           scale: 1.0/255.0
+           mean: [0.4914, 0.4822, 0.4465]
+           std: [0.2471, 0.2435, 0.2616]
+           order: hwc
+
+   PostProcess: null
+   ```
+2. Run the inference command:
+
+   ```
+   cd ./deploy/
+   python3.7 python/predict_rec.py -c ./configs/inference_rec.yaml
+   ```
+3. Check the output. The result is a vector of length 10 holding the model's raw output for the 10 classes, for example:
+
+   ```
+   demo.jpg:       [ 0.02560742 0.05221584 ... 0.11635944 -0.18817757
+     0.07170864]
+   ```
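+
+   Because `PostProcess` is set to `null`, the printed vector is the raw network output rather than a class id. If a predicted label and its probability are needed, they can be derived with a few lines of NumPy. The snippet below is only an illustration; the vector in it is a placeholder standing in for the printed output.
+
+   ```python
+   import numpy as np
+
+   logits = np.array([0.0256, 0.0522, 0.1164, -0.1882, 0.0717,
+                      0.0, 0.0, 0.0, 0.0, 0.0])   # placeholder 10-class output
+   probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the CIFAR-10 classes
+   print(int(probs.argmax()), float(probs.max())) # predicted class id and its probability
+   ```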
+
+#### 5.2.3 Inference with the C++ Prediction Engine
+
+PaddleClas provides an example of inference with the C++ prediction engine; see [Server-side C++ Inference](../../deployment/image_classification/cpp/linux.md) for the corresponding deployment steps. If you are on Windows, refer to the Visual Studio 2019 Community CMake build guide to compile the prediction library and run prediction.
+
+### 5.4 Serving Deployment
+
+Paddle Serving provides a high-performance, flexible and easy-to-use industrial-grade online inference service. It supports multiple protocols such as RESTful, gRPC and bRPC, and offers inference solutions for a variety of heterogeneous hardware and operating systems. For more details, see the Paddle Serving repository.
+
+PaddleClas provides an example of deploying models as a service with Paddle Serving; see [Model Serving Deployment](../../deployment/PP-ShiTu/paddle_serving.md) for the corresponding steps.
+
+### 5.5 On-Device Deployment
+
+Paddle Lite is a high-performance, lightweight, flexible and easily extensible deep learning inference framework that targets multiple hardware platforms, including mobile, embedded and server-side devices. For more details, see the Paddle Lite repository.
+
+PaddleClas provides an example of on-device deployment based on Paddle Lite; see [On-Device Deployment](../../deployment/image_classification/paddle_lite.md) for the corresponding steps.
+
+### 5.6 Paddle2ONNX Model Conversion and Prediction
+
+Paddle2ONNX converts PaddlePaddle models into the ONNX format. Through ONNX, Paddle models can be deployed on a variety of inference engines, including TensorRT, OpenVINO, MNN, TNN, NCNN, and any other engine or hardware that supports the open ONNX format. For more details, see the Paddle2ONNX repository.
+
+PaddleClas provides an example of converting an inference model to ONNX and running prediction with it; see [Paddle2ONNX Model Conversion and Prediction](../../deployment/image_classification/paddle2onnx.md) for the corresponding steps.
+
+## 6. References
+
+1. [FixMatch](https://arxiv.org/abs/2001.07685)
diff --git a/ppcls/arch/backbone/__init__.py b/ppcls/arch/backbone/__init__.py
index 1c5aa2442b360d8a176c4a53cb20ef7c521ceb56..6ddaa88292037a5b3aaa7228be1c0933d9444f0b 100644
--- a/ppcls/arch/backbone/__init__.py
+++ b/ppcls/arch/backbone/__init__.py
@@ -77,6 +77,7 @@ from .variant_models.vgg_variant import VGG19Sigmoid
 from .variant_models.pp_lcnet_variant import PPLCNet_x2_5_Tanh
 from .variant_models.pp_lcnetv2_variant import PPLCNetV2_base_ShiTu
 from .model_zoo.adaface_ir_net import AdaFace_IR_18, AdaFace_IR_34, AdaFace_IR_50, AdaFace_IR_101, AdaFace_IR_152, AdaFace_IR_SE_50, AdaFace_IR_SE_101, AdaFace_IR_SE_152, AdaFace_IR_SE_200
+from .model_zoo.wideresnet import WideResNet
 
 
 # help whl get all the models' api (class type) and components' api (func type)
diff --git a/ppcls/arch/backbone/model_zoo/wideresnet.py b/ppcls/arch/backbone/model_zoo/wideresnet.py
new file mode 100644
index 0000000000000000000000000000000000000000..ac5b4c47718e4b086685ab7910850ac971c9d5ab
--- /dev/null
+++ b/ppcls/arch/backbone/model_zoo/wideresnet.py
@@ -0,0 +1,238 @@
+import paddle
+import paddle.nn as nn
+import paddle.nn.functional as F
+from paddle import ParamAttr
+"""
+backbone option "WideResNet"
+code in this file is adapted from
+https://github.com/kekmodel/FixMatch-pytorch/blob/master/models/wideresnet.py
+thanks!
+"""
+
+
+def mish(x):
+    """Mish: A Self Regularized Non-Monotonic Neural Activation Function (https://arxiv.org/abs/1908.08681)"""
+    return x * paddle.tanh(F.softplus(x))
+
+
+class PSBatchNorm2D(nn.BatchNorm2D):
+    """How Does BN Increase Collapsed Neural Network Filters? 
(https://arxiv.org/abs/2001.11216)""" + + def __init__(self, + num_features, + alpha=0.1, + eps=1e-05, + momentum=0.999, + weight_attr=None, + bias_attr=None): + super().__init__(num_features, momentum, eps, weight_attr, bias_attr) + self.alpha = alpha + + def forward(self, x): + return super().forward(x) + self.alpha + + +class BasicBlock(nn.Layer): + def __init__(self, + in_planes, + out_planes, + stride, + drop_rate=0.0, + activate_before_residual=False): + super(BasicBlock, self).__init__() + self.bn1 = nn.BatchNorm2D(in_planes, momentum=0.999) + self.relu1 = nn.LeakyReLU(negative_slope=0.1) + self.conv1 = nn.Conv2D( + in_planes, + out_planes, + kernel_size=3, + stride=stride, + padding=1, + bias_attr=False) + self.bn2 = nn.BatchNorm2D(out_planes, momentum=0.999) + self.relu2 = nn.LeakyReLU(negative_slope=0.1) + self.conv2 = nn.Conv2D( + out_planes, + out_planes, + kernel_size=3, + stride=1, + padding=1, + bias_attr=False) + self.drop_rate = drop_rate + self.equalInOut = (in_planes == out_planes) + self.convShortcut = (not self.equalInOut) and nn.Conv2D( + in_planes, + out_planes, + kernel_size=1, + stride=stride, + padding=0, + bias_attr=False) or None + self.activate_before_residual = activate_before_residual + + def forward(self, x): + if not self.equalInOut and self.activate_before_residual == True: + x = self.relu1(self.bn1(x)) + else: + out = self.relu1(self.bn1(x)) + out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x))) + if self.drop_rate > 0: + out = F.dropout(out, p=self.drop_rate, training=self.training) + out = self.conv2(out) + return paddle.add(x if self.equalInOut else self.convShortcut(x), out) + + +class NetworkBlock(nn.Layer): + def __init__(self, + nb_layers, + in_planes, + out_planes, + block, + stride, + drop_rate=0.0, + activate_before_residual=False): + super(NetworkBlock, self).__init__() + self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, + stride, drop_rate, + activate_before_residual) + + def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, + drop_rate, activate_before_residual): + layers = [] + for i in range(int(nb_layers)): + layers.append( + block(i == 0 and in_planes or out_planes, out_planes, i == 0 + and stride or 1, drop_rate, activate_before_residual)) + return nn.Sequential(*layers) + + def forward(self, x): + return self.layer(x) + + +class Normalize(nn.Layer): + """ Ln normalization copied from + https://github.com/salesforce/CoMatch + """ + + def __init__(self, power=2): + super(Normalize, self).__init__() + self.power = power + + def forward(self, x): + norm = x.pow(self.power).sum(1, keepdim=True).pow(1. 
/ self.power) + out = x.divide(norm) + return out + + +class Wide_ResNet(nn.Layer): + def __init__(self, + num_classes, + depth=28, + widen_factor=2, + drop_rate=0.0, + proj=False, + proj_after=False, + low_dim=64): + super(Wide_ResNet, self).__init__() + # prepare self values + self.widen_factor = widen_factor + self.depth = depth + self.drop_rate = drop_rate + # if use projection head + self.proj = proj + # if use the output of projection head for classification + self.proj_after = proj_after + self.low_dim = low_dim + + channels = [ + 16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor + ] + assert ((depth - 4) % 6 == 0) + n = (depth - 4) / 6 + block = BasicBlock + # 1st conv before any network block + self.conv1 = nn.Conv2D( + 3, + channels[0], + kernel_size=3, + stride=1, + padding=1, + bias_attr=False) + # 1st block + self.block1 = NetworkBlock( + n, + channels[0], + channels[1], + block, + 1, + drop_rate, + activate_before_residual=True) + # 2nd block + self.block2 = NetworkBlock(n, channels[1], channels[2], block, 2, + drop_rate) + # 3rd block + self.block3 = NetworkBlock(n, channels[2], channels[3], block, 2, + drop_rate) + # global average pooling and classifier + self.bn1 = nn.BatchNorm2D(channels[3], momentum=0.999) + self.relu = nn.LeakyReLU(negative_slope=0.1) + + # if proj after means we classify after projection head + # so we must change the in channel to low_dim of laster fc + if self.proj_after: + self.fc = nn.Linear(self.low_dim, num_classes) + else: + self.fc = nn.Linear(channels[3], num_classes) + self.channels = channels[3] + + # projection head + if self.proj: + self.l2norm = Normalize(2) + + self.fc1 = nn.Linear(64 * self.widen_factor, + 64 * self.widen_factor) + self.relu_mlp = nn.LeakyReLU(negative_slope=0.1) + self.fc2 = nn.Linear(64 * self.widen_factor, self.low_dim) + + def forward(self, x): + feat = self.conv1(x) + feat = self.block1(feat) + feat = self.block2(feat) + feat = self.block3(feat) + feat = self.relu(self.bn1(feat)) + feat = F.adaptive_avg_pool2d(feat, 1) + feat = paddle.reshape(feat, [-1, self.channels]) + + if self.proj: + pfeat = self.fc1(feat) + pfeat = self.relu_mlp(pfeat) + pfeat = self.fc2(pfeat) + pfeat = self.l2norm(pfeat) + + # if projection after classifiy, we classify last + if self.proj_after: + out = self.fc(pfeat) + else: + out = self.fc(feat) + + return out, pfeat + + # output + out = self.fc(feat) + return out + + +def WideResNet(depth, + widen_factor, + dropout, + num_classes, + proj=False, + low_dim=64, + **kwargs): + return Wide_ResNet( + depth=depth, + widen_factor=widen_factor, + drop_rate=dropout, + num_classes=num_classes, + proj=proj, + low_dim=low_dim, + **kwargs) diff --git a/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_250.yaml b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_250.yaml new file mode 100644 index 0000000000000000000000000000000000000000..e8e00e8c85410786451bea66cc86f6a2df1073f6 --- /dev/null +++ b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_250.yaml @@ -0,0 +1,175 @@ +# global configs +Global: + checkpoints: null + pretrained_model: '../test/torch2paddle_cifar10' + output_dir: ./output_25 + device: gpu + save_interval: -1 + eval_during_train: True + eval_interval: 1 + epochs: 1024 + iter_per_epoch: 1024 + print_batch_step: 20 + use_visualdl: False + use_dali: False + train_mode: fixmatch + # used for static mode and model export + image_shape: [3, 224, 224] + save_inference_dir: ./inference + +SSL: + tempture: 1 + threshold: 0.95 + +EMA: + decay: 0.999 + +# AMP: +# scale_loss: 65536 +# 
use_dynamic_loss_scaling: True +# # O1: mixed fp16 +# level: O1 + +# model architecture +Arch: + name: WideResNet + depth: 28 + widen_factor: 2 + dropout: 0 + num_classes: 10 + +# loss function config for traing/eval process +Loss: + Train: + - CELoss: + weight: 1.0 + reduction: "mean" + Eval: + - CELoss: + weight: 1.0 +UnLabelLoss: + Train: + - CELoss: + weight: 1.0 + reduction: "none" + +Optimizer: + name: Momentum + momentum: 0.9 + use_nesterov: True + no_weight_decay_name: bn bias + weight_decay: 0.0005 + lr: + name: CosineFixmatch + learning_rate: 0.03 + num_warmup_steps: 0 + num_cycles: 0.4375 + +# data loader for train and eval +DataLoader: + Train: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: 25 + expand_labels: 263 + transform_ops: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: True + shuffle: True + loader: + num_workers: 8 + use_shared_memory: True + + UnLabelTrain: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: None + transform_ops_weak: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + transform_ops_strong: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - RandAugment: + num_layers: 2 + magnitude: 10 + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 448 + drop_last: True + shuffle: True + loader: + num_workers: 8 + use_shared_memory: True + + + Eval: + dataset: + name: Cifar10 + data_file: None + mode: 'test' + download: True + backend: 'pil' + sample_per_label: None + transform_ops: + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: False + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + +Metric: + Eval: + - TopkAcc: + topk: [1, 5] \ No newline at end of file diff --git a/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml new file mode 100644 index 0000000000000000000000000000000000000000..0327fcd9c32b56d1b081261dd99fee832ffff330 --- /dev/null +++ b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_40.yaml @@ -0,0 +1,175 @@ +# global configs +Global: + checkpoints: null + pretrained_model: 'https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/others/torch2paddle_weight/torch2paddle_initialize_cifar10_WideResNet_depth28_widenfactor2_classnum10.pdparams' + output_dir: ./output + device: gpu + save_interval: -1 + eval_during_train: True + eval_interval: 1 + epochs: 1024 + iter_per_epoch: 1024 + print_batch_step: 20 + use_visualdl: False + use_dali: False + train_mode: fixmatch + # used for static mode and model export + image_shape: [3, 224, 224] + save_inference_dir: ./inference + +SSL: + tempture: 1 + threshold: 0.95 + +EMA: + decay: 0.999 + +# AMP: +# 
scale_loss: 65536 +# use_dynamic_loss_scaling: True +# # O1: mixed fp16 +# level: O1 + +# model architecture +Arch: + name: WideResNet + depth: 28 + widen_factor: 2 + dropout: 0 + num_classes: 10 + +# loss function config for traing/eval process +Loss: + Train: + - CELoss: + weight: 1.0 + reduction: "mean" + Eval: + - CELoss: + weight: 1.0 +UnLabelLoss: + Train: + - CELoss: + weight: 1.0 + reduction: "none" + +Optimizer: + name: Momentum + momentum: 0.9 + use_nesterov: True + no_weight_decay_name: bn bias + weight_decay: 0.0005 + lr: + name: CosineFixmatch + learning_rate: 0.03 + num_warmup_steps: 0 + num_cycles: 0.4375 + +# data loader for train and eval +DataLoader: + Train: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: 4 + expand_labels: 1639 + transform_ops: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: True + shuffle: True + loader: + num_workers: 8 + use_shared_memory: True + + UnLabelTrain: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: None + transform_ops_weak: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + transform_ops_strong: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - RandAugment: + num_layers: 2 + magnitude: 10 + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 448 + drop_last: True + shuffle: True + loader: + num_workers: 8 + use_shared_memory: True + + + Eval: + dataset: + name: Cifar10 + data_file: None + mode: 'test' + download: True + backend: 'pil' + sample_per_label: None + transform_ops: + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: False + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + +Metric: + Eval: + - TopkAcc: + topk: [1, 5] \ No newline at end of file diff --git a/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_4000.yaml b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_4000.yaml new file mode 100644 index 0000000000000000000000000000000000000000..0989fc89e3b9040db37248d4859a95f0e0def679 --- /dev/null +++ b/ppcls/configs/ssl/FixMatch/FixMatch_cifar10_4000.yaml @@ -0,0 +1,175 @@ +# global configs +Global: + checkpoints: null + pretrained_model: 'https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/others/torch2paddle_weight/torch2paddle_initialize_cifar10_WideResNet_depth28_widenfactor2_classnum10.pdparams' + output_dir: ./output + device: gpu + save_interval: -1 + eval_during_train: True + eval_interval: 1 + epochs: 1024 + iter_per_epoch: 1024 + print_batch_step: 20 + use_visualdl: False + use_dali: False + train_mode: fixmatch + # used for static mode and model export + image_shape: [3, 224, 224] + save_inference_dir: ./inference + +SSL: + tempture: 1 + threshold: 0.95 + +EMA: + 
decay: 0.999 + +# AMP: +# scale_loss: 65536 +# use_dynamic_loss_scaling: True +# # O1: mixed fp16 +# level: O1 + +# model architecture +Arch: + name: WideResNet + depth: 28 + widen_factor: 2 + dropout: 0 + num_classes: 10 + +# loss function config for traing/eval process +Loss: + Train: + - CELoss: + weight: 1.0 + reduction: "mean" + Eval: + - CELoss: + weight: 1.0 +UnLabelLoss: + Train: + - CELoss: + weight: 1.0 + reduction: "none" + +Optimizer: + name: Momentum + momentum: 0.9 + use_nesterov: True + no_weight_decay_name: bn bias + weight_decay: 0.0005 + lr: + name: CosineFixmatch + learning_rate: 0.03 + num_warmup_steps: 0 + num_cycles: 0.4375 + +# data loader for train and eval +DataLoader: + Train: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: 400 + expand_labels: 17 + transform_ops: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: True + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + UnLabelTrain: + dataset: + name: Cifar10 + data_file: None + mode: 'train' + download: True + backend: 'pil' + sample_per_label: None + transform_ops_weak: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + transform_ops_strong: + - RandFlipImage: + flip_code: 1 + - Pad_paddle_vision: + padding: 4 + padding_mode: reflect + - RandCropImageV2: + size: [32, 32] + - RandAugment: + num_layers: 2 + magnitude: 10 + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 448 + drop_last: True + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + + Eval: + dataset: + name: Cifar10 + data_file: None + mode: 'test' + download: True + backend: 'pil' + sample_per_label: None + transform_ops: + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.4914, 0.4822, 0.4465] + std: [0.2471, 0.2435, 0.2616] + order: hwc + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: False + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + +Metric: + Eval: + - TopkAcc: + topk: [1, 5] \ No newline at end of file diff --git a/ppcls/data/__init__.py b/ppcls/data/__init__.py index 6745f41f263b0b1ac46cc462d9d90b68c6a09eb7..27f1cbaf03a6f1e9f947113d8db4552bae744f0c 100644 --- a/ppcls/data/__init__.py +++ b/ppcls/data/__init__.py @@ -32,6 +32,7 @@ from ppcls.data.dataloader.multi_scale_dataset import MultiScaleDataset from ppcls.data.dataloader.person_dataset import Market1501, MSMT17 from ppcls.data.dataloader.face_dataset import FiveValidationDataset, AdaFaceDataset from ppcls.data.dataloader.custom_label_dataset import CustomLabelDataset +from ppcls.data.dataloader.cifar import Cifar10, Cifar100 # sampler from ppcls.data.dataloader.DistributedRandomIdentitySampler import DistributedRandomIdentitySampler @@ -67,8 +68,9 @@ def create_operators(params, class_num=None): def build_dataloader(config, mode, device, use_dali=False, seed=None): assert mode in [ - 'Train', 'Eval', 'Test', 'Gallery', 'Query' - ], "Dataset mode 
should be Train, Eval, Test, Gallery, Query" + 'Train', 'Eval', 'Test', 'Gallery', 'Query', 'UnLabelTrain' + ], "Dataset mode should be Train, Eval, Test, Gallery, Query, UnLabelTrain" + assert mode in config.keys(), "{} config not in yaml".format(mode) # build dataset if use_dali: from ppcls.data.dataloader.dali import dali_dataloader diff --git a/ppcls/data/dataloader/__init__.py b/ppcls/data/dataloader/__init__.py index 63787a01f38d06e9020338c1a988a4a84f22f9e8..bb87c2f87d64cd83e94e81603ac251e6f983a290 100644 --- a/ppcls/data/dataloader/__init__.py +++ b/ppcls/data/dataloader/__init__.py @@ -12,3 +12,4 @@ from ppcls.data.dataloader.pk_sampler import PKSampler from ppcls.data.dataloader.person_dataset import Market1501, MSMT17 from ppcls.data.dataloader.face_dataset import AdaFaceDataset, FiveValidationDataset from ppcls.data.dataloader.custom_label_dataset import CustomLabelDataset +from ppcls.data.dataloader.cifar import Cifar10, Cifar100 diff --git a/ppcls/data/dataloader/cifar.py b/ppcls/data/dataloader/cifar.py new file mode 100644 index 0000000000000000000000000000000000000000..0522ed7d1de4036d5f1389815532504ec74580ca --- /dev/null +++ b/ppcls/data/dataloader/cifar.py @@ -0,0 +1,115 @@ +# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import print_function +import numpy as np +import cv2 +from ppcls.data import preprocess +from ppcls.data.preprocess import transform +from ppcls.data.dataloader.common_dataset import create_operators +from paddle.vision.datasets import Cifar10 as Cifar10_paddle +from paddle.vision.datasets import Cifar100 as Cifar100_paddle + + +class Cifar10(Cifar10_paddle): + def __init__(self, + data_file=None, + mode='train', + download=True, + backend='cv2', + sample_per_label=None, + expand_labels=1, + transform_ops=None, + transform_ops_weak=None, + transform_ops_strong=None): + super().__init__(data_file, mode, None, download, backend) + assert isinstance(expand_labels, int) + self._transform_ops = create_operators(transform_ops) + self._transform_ops_weak = create_operators(transform_ops_weak) + self._transform_ops_strong = create_operators(transform_ops_strong) + self.class_num = 10 + labels = [] + for x in self.data: + labels.append(x[1]) + labels = np.array(labels) + if isinstance(sample_per_label, int): + index = [] + for i in range(self.class_num): + idx = np.where(labels == i)[0] + idx = np.random.choice(idx, sample_per_label, False) + index.extend(idx) + index = index * expand_labels + data = [self.data[x] for x in index] + self.data = data + + def __getitem__(self, idx): + (image, label) = super().__getitem__(idx) + if self._transform_ops: + image1 = transform(image, self._transform_ops) + image1 = image1.transpose((2, 0, 1)) + return (image1, np.int64(label)) + elif self._transform_ops_weak and self._transform_ops_strong: + image2 = transform(image, self._transform_ops_weak) + image2 = image2.transpose((2, 0, 1)) + image3 = transform(image, self._transform_ops_strong) + image3 = image3.transpose((2, 0, 1)) + + return (image2, image3, np.int64(label)) + + +class Cifar100(Cifar100_paddle): + def __init__(self, + data_file=None, + mode='train', + download=True, + backend='pil', + sample_per_label=None, + expand_labels=1, + transform_ops=None, + transform_ops_weak=None, + transform_ops_strong=None): + super().__init__(data_file, mode, None, download, backend) + assert isinstance(expand_labels, int) + self._transform_ops = create_operators(transform_ops) + self._transform_ops_weak = create_operators(transform_ops_weak) + self._transform_ops_strong = create_operators(transform_ops_strong) + self.class_num = 100 + + labels = [] + for x in self.data: + labels.append(x[1]) + labels = np.array(labels) + if isinstance(sample_per_label, int): + index = [] + for i in range(self.class_num): + idx = np.where(labels == i)[0] + idx = np.random.choice(idx, sample_per_label, False) + index.extend(idx) + index = index * expand_labels + data = [self.data[x] for x in index] + self.data = data + + def __getitem__(self, idx): + (image, label) = super().__getitem__(idx) + if self._transform_ops: + image1 = transform(image, self._transform_ops) + image1 = image1.transpose((2, 0, 1)) + return (image1, np.int64(label)) + elif self._transform_ops_weak and self._transform_ops_strong: + image2 = transform(image, self._transform_ops_weak) + image2 = image2.transpose((2, 0, 1)) + image3 = transform(image, self._transform_ops_strong) + image3 = image3.transpose((2, 0, 1)) + + return (image2, image3, np.int64(label)) \ No newline at end of file diff --git a/ppcls/data/dataloader/common_dataset.py b/ppcls/data/dataloader/common_dataset.py index 292f509fab780709fb16b7d4017facf4f43801df..7530137eb18ef47aca3a010c7f623fc89f2ca7d9 100644 --- a/ppcls/data/dataloader/common_dataset.py +++ 
b/ppcls/data/dataloader/common_dataset.py @@ -30,6 +30,8 @@ def create_operators(params): Args: params(list): a dict list, used to create some operators """ + if params is None: + return None assert isinstance(params, list), ('operator config should be a list') ops = [] for operator in params: diff --git a/ppcls/data/preprocess/__init__.py b/ppcls/data/preprocess/__init__.py index 421e0372699448dfd6cb6cc671dd81ad08fd7095..f2f041ba1fcf3e272886afd567f58eeb7d83f79e 100644 --- a/ppcls/data/preprocess/__init__.py +++ b/ppcls/data/preprocess/__init__.py @@ -44,6 +44,7 @@ from ppcls.data.preprocess.ops.operators import RandomRotation from ppcls.data.preprocess.ops.operators import Padv2 from ppcls.data.preprocess.ops.operators import RandomRot90 from .ops.operators import format_data +from paddle.vision.transforms import Pad as Pad_paddle_vision from ppcls.data.preprocess.batch_ops.batch_operators import MixupOperator, CutmixOperator, OpSampler, FmixOperator from ppcls.data.preprocess.batch_ops.batch_operators import MixupCutmixHybrid diff --git a/ppcls/engine/engine.py b/ppcls/engine/engine.py index 27bc2ece38c0829b497ac2feb53c087f5fe4cca2..95f264058d36cb7408c254d5975787451c24b691 100644 --- a/ppcls/engine/engine.py +++ b/ppcls/engine/engine.py @@ -41,7 +41,7 @@ from ppcls.utils import save_load from ppcls.data.utils.get_image_list import get_image_list from ppcls.data.postprocess import build_postprocess from ppcls.data import create_operators -from ppcls.engine.train import train_epoch +from ppcls.engine import train as train_method from ppcls.engine.train.utils import type_name from ppcls.engine import evaluation from ppcls.arch.gears.identity_head import IdentityHead @@ -54,6 +54,7 @@ class Engine(object): self.config = config self.eval_mode = self.config["Global"].get("eval_mode", "classification") + self.train_mode = self.config["Global"].get("train_mode", None) if "Head" in self.config["Arch"] or self.config["Arch"].get("is_rec", False): self.is_rec = True @@ -79,7 +80,11 @@ class Engine(object): assert self.eval_mode in [ "classification", "retrieval", "adaface" ], logger.error("Invalid eval mode: {}".format(self.eval_mode)) - self.train_epoch_func = train_epoch + if self.train_mode is None: + self.train_epoch_func = train_method.train_epoch + else: + self.train_epoch_func = getattr(train_method, + "train_epoch_" + self.train_mode) self.eval_func = getattr(evaluation, self.eval_mode + "_eval") self.use_dali = self.config['Global'].get("use_dali", False) @@ -119,6 +124,20 @@ class Engine(object): if self.mode == 'train': self.train_dataloader = build_dataloader( self.config["DataLoader"], "Train", self.device, self.use_dali) + if self.config["DataLoader"].get('UnLabelTrain', None) is not None: + self.unlabel_train_dataloader = build_dataloader( + self.config["DataLoader"], "UnLabelTrain", self.device, + self.use_dali) + else: + self.unlabel_train_dataloader = None + + self.iter_per_epoch = len(self.train_dataloader) - 1 if platform.system( + ) == "Windows" else len(self.train_dataloader) + if self.config["Global"].get("iter_per_epoch", None): + # set max iteration per epoch mannualy, when training by iteration(s), such as XBM, FixMatch. 
+ self.iter_per_epoch = self.config["Global"].get("iter_per_epoch") + self.iter_per_epoch = self.iter_per_epoch // self.update_freq * self.update_freq + if self.mode == "eval" or (self.mode == "train" and self.config["Global"]["eval_during_train"]): if self.eval_mode in ["classification", "adaface"]: @@ -142,8 +161,11 @@ class Engine(object): # build loss if self.mode == "train": - loss_info = self.config["Loss"]["Train"] - self.train_loss_func = build_loss(loss_info) + label_loss_info = self.config["Loss"]["Train"] + self.train_loss_func = build_loss(label_loss_info) + unlabel_loss_info = self.config.get("UnLabelLoss", {}).get("Train", + None) + self.unlabel_train_loss_func = build_loss(unlabel_loss_info) if self.mode == "eval" or (self.mode == "train" and self.config["Global"]["eval_during_train"]): loss_config = self.config.get("Loss", None) @@ -208,7 +230,7 @@ class Engine(object): if self.mode == 'train': self.optimizer, self.lr_sch = build_optimizer( self.config["Optimizer"], self.config["Global"]["epochs"], - len(self.train_dataloader) // self.update_freq, + self.iter_per_epoch // self.update_freq, [self.model, self.train_loss_func]) # AMP training and evaluating @@ -345,14 +367,6 @@ class Engine(object): if metric_info is not None: best_metric.update(metric_info) - self.max_iter = len(self.train_dataloader) - 1 if platform.system( - ) == "Windows" else len(self.train_dataloader) - if self.config["Global"].get("iter_per_epoch", None): - # set max iteration per epoch mannualy, when training by iteration(s), such as XBM, FixMatch. - self.max_iter = self.config["Global"].get("iter_per_epoch") - - self.max_iter = self.max_iter // self.update_freq * self.update_freq - for epoch_id in range(best_metric["epoch"] + 1, self.config["Global"]["epochs"] + 1): acc = 0.0 @@ -431,7 +445,7 @@ class Engine(object): writer=self.vdl_writer) # save model - if epoch_id % save_interval == 0: + if save_interval > 0 and epoch_id % save_interval == 0: save_load.save_model( self.model, self.optimizer, {"metric": acc, diff --git a/ppcls/engine/train/__init__.py b/ppcls/engine/train/__init__.py index 800d3a41edfa4c3ba2ad8c9295d66c4acfe1ea5d..d22cbc4295b8ee9db990baa237b5abdae36c34c0 100644 --- a/ppcls/engine/train/__init__.py +++ b/ppcls/engine/train/__init__.py @@ -12,3 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. from ppcls.engine.train.train import train_epoch +from ppcls.engine.train.train_fixmatch import train_epoch_fixmatch \ No newline at end of file diff --git a/ppcls/engine/train/train.py b/ppcls/engine/train/train.py index d2ee6b34a76fb91ca5aed4cca56349d1cf7d0d4b..31f022a891fc5d024e669fbb2152142b03ec3a61 100644 --- a/ppcls/engine/train/train.py +++ b/ppcls/engine/train/train.py @@ -25,7 +25,7 @@ def train_epoch(engine, epoch_id, print_batch_step): if not hasattr(engine, "train_dataloader_iter"): engine.train_dataloader_iter = iter(engine.train_dataloader) - for iter_id in range(engine.max_iter): + for iter_id in range(engine.iter_per_epoch): # fetch data batch from dataloader try: batch = engine.train_dataloader_iter.next() diff --git a/ppcls/engine/train/train_fixmatch.py b/ppcls/engine/train/train_fixmatch.py new file mode 100644 index 0000000000000000000000000000000000000000..20e38f9bb99860c7526c5701204fe368a586652d --- /dev/null +++ b/ppcls/engine/train/train_fixmatch.py @@ -0,0 +1,165 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from __future__ import absolute_import, division, print_function + +import time +import paddle +from ppcls.engine.train.utils import update_loss, update_metric, log_info +from ppcls.utils import profiler +from paddle.nn import functional as F +import numpy as np + + +def train_epoch_fixmatch(engine, epoch_id, print_batch_step): + tic = time.time() + if not hasattr(engine, "train_dataloader_iter"): + engine.train_dataloader_iter = iter(engine.train_dataloader) + engine.unlabel_train_dataloader_iter = iter( + engine.unlabel_train_dataloader) + temperture = engine.config["SSL"].get("temperture", 1) + threshold = engine.config["SSL"].get("threshold", 0.95) + assert engine.iter_per_epoch is not None, "Global.iter_per_epoch need to be set." + threshold = paddle.to_tensor(threshold) + for iter_id in range(engine.iter_per_epoch): + if iter_id >= engine.iter_per_epoch: + break + if iter_id == 5: + for key in engine.time_info: + engine.time_info[key].reset() + try: + label_data_batch = engine.train_dataloader_iter.next() + except Exception: + engine.train_dataloader_iter = iter(engine.train_dataloader) + label_data_batch = engine.train_dataloader_iter.next() + try: + unlabel_data_batch = engine.unlabel_train_dataloader_iter.next() + except Exception: + engine.unlabel_train_dataloader_iter = iter( + engine.unlabel_train_dataloader) + unlabel_data_batch = engine.unlabel_train_dataloader_iter.next() + assert len(unlabel_data_batch) == 3 + assert unlabel_data_batch[0].shape == unlabel_data_batch[1].shape + engine.time_info["reader_cost"].update(time.time() - tic) + batch_size = label_data_batch[0].shape[0] + unlabel_data_batch[0].shape[0] \ + + unlabel_data_batch[1].shape[0] + engine.global_step += 1 + + # make inputs + inputs_x, targets_x = label_data_batch + inputs_u_w, inputs_u_s, targets_u = unlabel_data_batch + batch_size_label = inputs_x.shape[0] + inputs = paddle.concat([inputs_x, inputs_u_w, inputs_u_s], axis=0) + + # image input + if engine.amp: + amp_level = engine.config['AMP'].get("level", "O1").upper() + with paddle.amp.auto_cast( + custom_black_list={ + "flatten_contiguous_range", "greater_than" + }, + level=amp_level): + loss_dict, logits_label = get_loss( + engine, inputs, batch_size_label, temperture, threshold, + targets_x) + else: + loss_dict, logits_label = get_loss(engine, inputs, + batch_size_label, temperture, + threshold, targets_x) + + # loss + loss = loss_dict["loss"] + + # backward & step opt + if engine.amp: + scaled = engine.scaler.scale(loss) + scaled.backward() + + for i in range(len(engine.optimizer)): + engine.scaler.minimize(engine.optimizer[i], scaled) + else: + loss.backward() + for i in range(len(engine.optimizer)): + engine.optimizer[i].step() + + # step lr(by step) + for i in range(len(engine.lr_sch)): + if not getattr(engine.lr_sch[i], "by_epoch", False): + engine.lr_sch[i].step() + # clear grad + for i in range(len(engine.optimizer)): + engine.optimizer[i].clear_grad() + + # update ema + if engine.ema: + 
engine.model_ema.update(engine.model) + + # below code just for logging + # update metric_for_logger + update_metric(engine, logits_label, label_data_batch, batch_size) + # update_loss_for_logger + update_loss(engine, loss_dict, batch_size) + engine.time_info["batch_cost"].update(time.time() - tic) + if iter_id % print_batch_step == 0: + log_info(engine, batch_size, epoch_id, iter_id) + tic = time.time() + + # step lr(by epoch) + for i in range(len(engine.lr_sch)): + if getattr(engine.lr_sch[i], "by_epoch", False): + engine.lr_sch[i].step() + + +def get_loss(engine, inputs, batch_size_label, temperture, threshold, + targets_x): + # For pytroch version, inputs need to use interleave and de_interleave + # to reshape and transpose inputs and logits, but it dosen't affect the + # result. So this paddle version dose not use the two transpose func. + # inputs = interleave(inputs, inputs.shape[0] // batch_size_label) + logits = engine.model(inputs) + # logits = de_interleave(logits, inputs.shape[0] // batch_size_label) + logits_x = logits[:batch_size_label] + logits_u_w, logits_u_s = logits[batch_size_label:].chunk(2) + loss_dict_label = engine.train_loss_func(logits_x, targets_x) + probs_u_w = F.softmax(logits_u_w.detach() / temperture, axis=-1) + p_targets_u, mask = get_psuedo_label_and_mask(probs_u_w, threshold) + unlabel_celoss = engine.unlabel_train_loss_func(logits_u_s, + p_targets_u)["CELoss"] + unlabel_celoss = (unlabel_celoss * mask).mean() + loss_dict = dict() + for k, v in loss_dict_label.items(): + if k != "loss": + loss_dict[k + "_label"] = v + loss_dict["CELoss_unlabel"] = unlabel_celoss + loss_dict["loss"] = loss_dict_label['loss'] + unlabel_celoss + return loss_dict, logits_x + + +def get_psuedo_label_and_mask(probs_u_w, threshold): + max_probs = paddle.max(probs_u_w, axis=-1) + p_targets_u = paddle.argmax(probs_u_w, axis=-1) + + mask = paddle.greater_equal(max_probs, threshold).astype('float') + return p_targets_u, mask + + +def interleave(x, size): + s = list(x.shape) + return x.reshape([-1, size] + s[1:]).transpose( + [1, 0, 2, 3, 4]).reshape([-1] + s[1:]) + + +def de_interleave(x, size): + s = list(x.shape) + return x.reshape([size, -1] + s[1:]).transpose( + [1, 0, 2]).reshape([-1] + s[1:]) diff --git a/ppcls/engine/train/utils.py b/ppcls/engine/train/utils.py index 364425035363c4662b9ca847eccde6fe293a8c18..b649c8e8bc5592b50eef3ffb5a6cbbb57ef315f9 100644 --- a/ppcls/engine/train/utils.py +++ b/ppcls/engine/train/utils.py @@ -53,13 +53,14 @@ def log_info(trainer, batch_size, epoch_id, iter_id): ips_msg = "ips: {:.5f} samples/s".format( batch_size / trainer.time_info["batch_cost"].avg) - eta_sec = ( - (trainer.config["Global"]["epochs"] - epoch_id + 1 - ) * trainer.max_iter - iter_id) * trainer.time_info["batch_cost"].avg + + eta_sec = ((trainer.config["Global"]["epochs"] - epoch_id + 1) * + trainer.iter_per_epoch - iter_id) * trainer.time_info["batch_cost"].avg eta_msg = "eta: {:s}".format(str(datetime.timedelta(seconds=int(eta_sec)))) logger.info("[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".format( - epoch_id, trainer.config["Global"]["epochs"], iter_id, - trainer.max_iter, lr_msg, metric_msg, time_msg, ips_msg, eta_msg)) + epoch_id, trainer.config["Global"]["epochs"], iter_id, trainer.iter_per_epoch, + lr_msg, metric_msg, time_msg, ips_msg, eta_msg)) + for i, lr in enumerate(trainer.lr_sch): logger.scaler( diff --git a/ppcls/loss/__init__.py b/ppcls/loss/__init__.py index bdba63d023d872275012ea3746de4c9398e49bb8..513ede138c055bf045481ccd6067a86b8c8166b5 100644 --- 
a/ppcls/loss/__init__.py +++ b/ppcls/loss/__init__.py @@ -77,6 +77,8 @@ class CombinedLoss(nn.Layer): def build_loss(config): + if config is None: + return None module_class = CombinedLoss(copy.deepcopy(config)) logger.debug("build loss {} success.".format(module_class)) return module_class diff --git a/ppcls/loss/celoss.py b/ppcls/loss/celoss.py index a78926170c6b8edf7d85f62204f34437eeb118b2..81d12a611d760b04d6f556145a00ca3f136b92d1 100644 --- a/ppcls/loss/celoss.py +++ b/ppcls/loss/celoss.py @@ -26,11 +26,13 @@ class CELoss(nn.Layer): Cross entropy loss """ - def __init__(self, epsilon=None): + def __init__(self, reduction="mean", epsilon=None): super().__init__() if epsilon is not None and (epsilon <= 0 or epsilon >= 1): epsilon = None self.epsilon = epsilon + assert reduction in ["mean", "sum", "none"] + self.reduction = reduction def _labelsmoothing(self, target, class_num): if len(target.shape) == 1 or target.shape[-1] != class_num: @@ -55,8 +57,11 @@ class CELoss(nn.Layer): soft_label = True else: soft_label = False - loss = F.cross_entropy(x, label=label, soft_label=soft_label) - loss = loss.mean() + loss = F.cross_entropy( + x, + label=label, + soft_label=soft_label, + reduction=self.reduction) return {"CELoss": loss} diff --git a/ppcls/optimizer/learning_rate.py b/ppcls/optimizer/learning_rate.py index 58195477fc93cf9972d16708fad568a374423330..8bbeb6cee04c4617b8156c65e06a56ccd8df041b 100644 --- a/ppcls/optimizer/learning_rate.py +++ b/ppcls/optimizer/learning_rate.py @@ -14,10 +14,10 @@ from __future__ import (absolute_import, division, print_function, unicode_literals) +import math import types from abc import abstractmethod from typing import Union - from paddle.optimizer import lr from ppcls.utils import logger @@ -421,7 +421,6 @@ class ReduceOnPlateau(LRBase): last_epoch (int, optional): last epoch. Defaults to -1. by_epoch (bool, optional): learning rate decays by epoch when by_epoch is True, else by iter. Defaults to False. """ - def __init__(self, epochs, step_each_epoch, @@ -475,3 +474,47 @@ class ReduceOnPlateau(LRBase): setattr(learning_rate, "by_epoch", self.by_epoch) return learning_rate + + +class CosineFixmatch(LRBase): + """Cosine decay in FixMatch style + + Args: + epochs (int): total epoch(s) + step_each_epoch (int): number of iterations within an epoch + learning_rate (float): learning rate + num_warmup_steps (int): the number warmup steps. + warmunum_cycles (float, optional): the factor for cosine in FixMatch learning rate. Defaults to 7 / 16. + last_epoch (int, optional): last epoch. Defaults to -1. + by_epoch (bool, optional): learning rate decays by epoch when by_epoch is True, else by iter. Defaults to False. 
+ """ + def __init__(self, + epochs, + step_each_epoch, + learning_rate, + num_warmup_steps, + num_cycles=7 / 16, + last_epoch=-1, + by_epoch=False): + self.epochs = epochs + self.step_each_epoch = step_each_epoch + self.learning_rate = learning_rate + self.num_warmup_steps = num_warmup_steps + self.num_cycles = num_cycles + self.last_epoch = last_epoch + + def __call__(self): + def _lr_lambda(current_step): + if current_step < self.num_warmup_steps: + return float(current_step) / float( + max(1, self.num_warmup_steps)) + no_progress = float(current_step - self.num_warmup_steps) / \ + float(max(1, self.epochs * self.step_each_epoch - self.num_warmup_steps)) + return max(0., math.cos(math.pi * self.num_cycles * no_progress)) + + learning_rate = lr.LambdaDecay( + learning_rate=self.learning_rate, + lr_lambda=_lr_lambda, + last_epoch=self.last_epoch) + setattr(learning_rate, "by_epoch", self.by_epoch) + return learning_rate \ No newline at end of file diff --git a/ppcls/optimizer/optimizer.py b/ppcls/optimizer/optimizer.py index c446fc1dec9e01fe85dfac68dee829082655826a..8206846b4a5586ad3bee4a1aa6033b55c5964da7 100644 --- a/ppcls/optimizer/optimizer.py +++ b/ppcls/optimizer/optimizer.py @@ -93,24 +93,49 @@ class Momentum(object): momentum, weight_decay=None, grad_clip=None, - multi_precision=True): + use_nesterov=False, + multi_precision=True, + no_weight_decay_name=None): super().__init__() self.learning_rate = learning_rate self.momentum = momentum self.weight_decay = weight_decay self.grad_clip = grad_clip self.multi_precision = multi_precision + self.use_nesterov = use_nesterov + self.no_weight_decay_name_list = no_weight_decay_name.split( + ) if no_weight_decay_name else [] def __call__(self, model_list): # model_list is None in static graph - parameters = sum([m.parameters() for m in model_list], - []) if model_list else None + parameters = None + if len(self.no_weight_decay_name_list) > 0: + params_with_decay = [] + params_without_decay = [] + for m in model_list: + params = [p for n, p in m.named_parameters() \ + if not any(nd in n for nd in self.no_weight_decay_name_list)] + params_with_decay.extend(params) + params = [p for n, p in m.named_parameters() \ + if any(nd in n for nd in self.no_weight_decay_name_list)] + params_without_decay.extend(params) + parameters = [{ + "params": params_with_decay, + "weight_decay": self.weight_decay + }, { + "params": params_without_decay, + "weight_decay": 0.0 + }] + else: + parameters = sum([m.parameters() for m in model_list], + []) if model_list else None opt = optim.Momentum( learning_rate=self.learning_rate, momentum=self.momentum, weight_decay=self.weight_decay, grad_clip=self.grad_clip, multi_precision=self.multi_precision, + use_nesterov=self.use_nesterov, parameters=parameters) if hasattr(opt, '_use_multi_tensor'): opt = optim.Momentum( @@ -120,6 +145,7 @@ class Momentum(object): grad_clip=self.grad_clip, multi_precision=self.multi_precision, parameters=parameters, + use_nesterov=self.use_nesterov, use_multi_tensor=True) return opt