Unverified commit 601c81d2, authored by haoyuying, committed by GitHub

add transform docs and demo README

Parent 794c3794
# PaddleHub Image Colorization
## How to start fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to fine-tune the user_guided_colorization model on datasets such as [Canvas](../../docs/reference/dataset.md#class-hubdatasetsCanvas).
## Code steps
Fine-tuning with the PaddleHub Fine-tune API takes 4 steps.
### Step 1: Define the data preprocessing
import paddlehub.vision.transforms as T
transform = T.Compose([T.Resize((176, 176), interpolation='NEAREST'),
T.RGB2LAB()], to_rgb=True)
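The `transforms` data augmentation module defines a rich set of preprocessing operations; replace them with whichever operations your task needs.
**NOTE:** Set `to_rgb` to True in `T.Compose`.
### Step 2: Download and use the dataset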
from paddlehub.datasets import Canvas
color_set = Canvas(transform=transform, mode='train')
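* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train` and `test`, default is `train`.
For the dataset preparation code, see [canvas.py](../../paddlehub/datasets/canvas.py). `hub.datasets.Canvas()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory.
### Step 3: Load the pretrained model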
model = hub.Module(name='user_guided_colorization', classification=True, prob=1, num_point=None, load_checkpoint=None)
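* `name`: the name of the pretrained model.
* `classification`: the colorization task is trained in two stages. In the first stage, set `classification` to True to train the network without color hint patches; in the second stage, set `classification` to False to train with color hint patches added to the input image.
* `prob`: the probability of not adding color hint patches; the default is 1, meaning no patches are added.
* `num_point`: the number of color hint patches, default is None.
* `load_checkpoint`: the path to a checkpoint you trained yourself; if None, the default parameters provided with the model are loaded.
### Step 4: Choose the optimization strategy and run configuration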
optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='img_colorization_ckpt')
trainer.train(color_set, epochs=201, batch_size=25, eval_dataset=color_set, log_interval=10, save_interval=10)
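#### Optimization strategy
Paddle 2.0-rc offers a range of optimizers, such as `SGD`, `Adam`, and `Adamax`; see the [optimizer documentation](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/optimizer/optimizer/Optimizer_cn.html) for details.
* `learning_rate`: the global learning rate, set to 1e-4 in this example;
* `parameters`: the model parameters to optimize.
#### Run configuration
`Trainer` controls the fine-tuning run and accepts the following parameters:
* `model`: the model to optimize;
* `optimizer`: the optimizer to use;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory where model parameters are saved;
* `compare_metrics`: the metric used to decide which model is saved as the best one;
`trainer.train` controls the training loop itself and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it to fit the available memory;
* `num_workers`: the number of data-loading workers, default is 0;
* `eval_dataset`: the dataset used for validation;
* `log_interval`: the logging interval, measured in training batches (steps);
* `save_interval`: the checkpoint-saving interval, measured in training epochs.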
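Because `log_interval` counts batches while `save_interval` counts epochs, it can help to work out in advance how often each fires. The sketch below is a plain-Python estimate, not PaddleHub code. It assumes the last partial batch of an epoch is kept and that logging and saving happen whenever the step or epoch counter is an exact multiple of the interval; `training_schedule` is a hypothetical helper name.

```python
import math


def training_schedule(num_samples, batch_size, epochs, log_interval, save_interval):
    """Estimate when logs and checkpoints occur (assumptions in the comments)."""
    # Assumption: the final, possibly smaller, batch of each epoch is still used.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    total_steps = steps_per_epoch * epochs
    # Assumption: one log line every `log_interval` training steps (batches).
    log_steps = [s for s in range(1, total_steps + 1) if s % log_interval == 0]
    # Assumption: one checkpoint every `save_interval` epochs.
    save_epochs = [e for e in range(1, epochs + 1) if e % save_interval == 0]
    return steps_per_epoch, log_steps, save_epochs


steps, logs, saves = training_schedule(
    num_samples=100, batch_size=25, epochs=20, log_interval=10, save_interval=10)
print(steps, len(logs), saves)
```

With a small batch size the number of steps per epoch grows, so the same `log_interval` produces proportionally more log lines per epoch.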
## Model prediction
import paddle
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='user_guided_colorization', load_checkpoint='/PATH/TO/CHECKPOINT')
    result = model.predict(images='house.png', visualization=True, save_path='result')
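Once the parameters are configured correctly, run the script with `python predict.py`. For details on loading a model, see [load](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/framework/io/load_cn.html#load).
**NOTE:** For prediction, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning. To get an oil-painting colorization effect, download the parameter file [oil-painting colorization](https://paddlehub.bj.bcebos.com/dygraph/models/canvas_rc.pdparams).
* `images`: the path of the original image;
* `visualization`: whether to visualize the result, default is True;
* `save_path`: the path where results are saved, default is 'result'.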
@@ -2,5 +2,5 @@ import paddle
 import paddlehub as hub
 if __name__ == '__main__':
-    model = hub.Module(name='user_guided_colorization', load_checkpoint='/PATH/TO/CHECKPOINT')
+    model = hub.Module(name='user_guided_colorization', load_checkpoint='/PATH/TO/CHECKPOINT', prob=0.01)
     result = model.predict(images='house.png')
@@ -6,12 +6,12 @@ from paddlehub.datasets import Canvas
 if __name__ == '__main__':
-    model = hub.Module(name='user_guided_colorization', classification=True, prob= 0.125)
+    model = hub.Module(name='user_guided_colorization', classification=True, prob=1)
     transform = T.Compose([T.Resize((256, 256), interpolation='NEAREST'),
                            T.RandomPaddingCrop(crop_size=176),
-                           T.RGB2LAB()])
+                           T.RGB2LAB()], to_rgb=True)
     color_set = Canvas(transform=transform, mode='train')
     optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
     trainer = Trainer(model, optimizer, checkpoint_dir='img_colorization_ckpt')
-    trainer.train(color_set, epochs=101, batch_size=2, eval_dataset=color_set, log_interval=10, save_interval=10)
+    trainer.train(color_set, epochs=201, batch_size=25, eval_dataset=color_set, log_interval=10, save_interval=10)
# PaddleHub Image Classification
## How to start fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to fine-tune resnet50_vd_imagenet_ssld on datasets such as [Flowers](../../docs/reference/dataset.md#class-hubdatasetsflowers).
## Code steps
Fine-tuning with the PaddleHub Fine-tune API takes 4 steps.
### Step 1: Define the data preprocessing
import paddlehub.vision.transforms as T
transforms = T.Compose([T.Resize((224, 224)), T.Normalize()])
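The `transforms` data augmentation module defines a rich set of preprocessing operations; replace them with whichever operations your task needs.
### Step 2: Download and use the dataset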
from paddlehub.datasets import Flowers
flowers = Flowers(transforms)
flowers_validate = Flowers(transforms, mode='val')
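* `transforms`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train`, `test`, and `val`, default is `train`.
For the dataset preparation code, see [flowers.py](../../paddlehub/datasets/flowers.py). `hub.datasets.Flowers()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory.
### Step 3: Load the pretrained model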
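model = hub.Module(name="resnet50_vd_imagenet_ssld", label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"])
* `name`: the name of the pretrained model.
* `label_list`: the class labels for the final classification output.
# Change the name parameter to switch seamlessly to an EfficientNet model, for example:
model = hub.Module(name="efficientnetb7_imagenet")
### Step 4: Choose the optimization strategy and run configuration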
optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='img_classification_ckpt')
trainer.train(flowers, epochs=100, batch_size=32, eval_dataset=flowers_validate, save_interval=1)
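#### Optimization strategy
Paddle 2.0-rc offers a range of optimizers, such as `SGD`, `Adam`, and `Adamax`; see the [optimizer documentation](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/optimizer/optimizer/Optimizer_cn.html) for details.
* `learning_rate`: the global learning rate, default is 1e-3;
* `parameters`: the model parameters to optimize.
#### Run configuration
`Trainer` controls the fine-tuning run and accepts the following parameters:
* `model`: the model to optimize;
* `optimizer`: the optimizer to use;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory where model parameters are saved;
* `compare_metrics`: the metric used to decide which model is saved as the best one;
`trainer.train` controls the training loop itself and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it to fit the available memory;
* `num_workers`: the number of data-loading workers, default is 0;
* `eval_dataset`: the validation dataset;
* `log_interval`: the logging interval, measured in training batches (steps);
* `save_interval`: the checkpoint-saving interval, measured in training epochs.
## Model prediction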
import paddle
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='resnet50_vd_imagenet_ssld', label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"], load_checkpoint='/PATH/TO/CHECKPOINT')
    result = model.predict('flower.jpg')
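Once the parameters are configured correctly, run the script with `python predict.py`. For details on loading a model, see [load](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/framework/io/load_cn.html#load).
**NOTE:** For prediction, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning.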
@@ -2,8 +2,5 @@ import paddle
 import paddlehub as hub
 if __name__ == '__main__':
-    model = hub.Module(name='mobilenet_v2_imagenet', class_dim=5)
-    state_dict = paddle.load('img_classification_ckpt')
-    result = model.predict('flower.jpg')
+    model = hub.Module(name='resnet50_vd_imagenet_ssld', label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"], load_checkpoint=None)
+    result, feature = model.predict('flower.jpg')
\ No newline at end of file
@@ -6,13 +6,9 @@ from paddlehub.datasets import Flowers
 if __name__ == '__main__':
     transforms = T.Compose([T.Resize((224, 224)), T.Normalize()])
     flowers = Flowers(transforms)
     flowers_validate = Flowers(transforms, mode='val')
-    model = hub.Module(name='mobilenet_v2_imagenet', class_dim=flowers.num_classes)
+    model = hub.Module(name='resnet50_vd_imagenet_ssld', label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"], load_checkpoint=None)
     optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())
     trainer = Trainer(model, optimizer, checkpoint_dir='img_classification_ckpt')
-    trainer.train(flowers, epochs=100, batch_size=32, eval_dataset=flowers_validate, save_interval=1)
+    trainer.train(flowers, epochs=100, batch_size=32, eval_dataset=flowers_validate, save_interval=10)
\ No newline at end of file
# PaddleHub Image Style Transfer
## How to start fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to fine-tune the msgnet model on datasets such as [MiniCOCO](../../docs/reference/dataset.md#class-hubdatasetsMiniCOCO).
## Code steps
Fine-tuning with the PaddleHub Fine-tune API takes 4 steps.
### Step 1: Define the data preprocessing
import paddlehub.vision.transforms as T
transform = T.Compose([T.Resize((256, 256), interpolation='LINEAR')])
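The `transforms` data augmentation module defines a rich set of preprocessing operations; replace them with whichever operations your task needs.
### Step 2: Download and use the dataset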
from paddlehub.datasets.minicoco import MiniCOCO
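styledata = MiniCOCO(transform=transform, mode='train')
* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train` and `test`, default is `train`.
For the dataset preparation code, see [minicoco.py](../../paddlehub/datasets/minicoco.py). `hub.datasets.MiniCOCO()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory.
### Step 3: Load the pretrained model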
model = hub.Module(name='msgnet', load_checkpoint=None)
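* `name`: the name of the pretrained model.
* `load_checkpoint`: the path to a checkpoint you trained yourself; if None, the default parameters provided with the model are loaded.
### Step 4: Choose the optimization strategy and run configuration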
optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_style_ckpt')
trainer.train(styledata, epochs=101, batch_size=4, eval_dataset=styledata, log_interval=10, save_interval=10)
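#### Optimization strategy
Paddle 2.0-rc offers a range of optimizers, such as `SGD`, `Adam`, and `Adamax`; see the [optimizer documentation](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/optimizer/optimizer/Optimizer_cn.html) for details.
* `learning_rate`: the initial learning rate, a Python float;
* `power`: the power of the polynomial, default is 1.0;
* `decay_steps`: the number of decay steps; must be a positive integer, and it determines the decay period.
* `learning_rate`: the global learning rate, set to 1e-4 in this example;
* `parameters`: the model parameters to optimize.
#### Run configuration
`Trainer` controls the fine-tuning run and accepts the following parameters:
* `model`: the model to optimize;
* `optimizer`: the optimizer to use;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory where model parameters are saved;
* `compare_metrics`: the metric used to decide which model is saved as the best one;
`trainer.train` controls the training loop itself and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it to fit the available memory;
* `num_workers`: the number of data-loading workers, default is 0;
* `eval_dataset`: the validation dataset;
* `log_interval`: the logging interval, measured in training batches (steps);
* `save_interval`: the checkpoint-saving interval, measured in training epochs.
## Model prediction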
import paddle
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='msgnet', load_checkpoint='/PATH/TO/CHECKPOINT')
    result = model.predict(origin="venice-boat.jpg", style="candy.jpg", visualization=True, save_path='result')
Once the parameters are configured correctly, run the script with `python predict.py`. For details on loading a model, see [load](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/framework/io/load_cn.html#load).
* `origin`: the path of the original image;
* `style`: the path of the style image;
* `visualization`: whether to visualize the result, default is True;
* `save_path`: the path where results are saved, default is 'result'.
**NOTE:** For prediction, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning.
# Class `hub.vision.transforms.Compose`
hub.vision.transforms.Compose(
    transforms: Callable,
    to_rgb: bool = False)
Compose preprocessing operators to obtain preprocessed data. The input image for every operation has shape [H, W, C], where H is the image height, W is the image width, and C is the number of image channels.
* transforms(callmethod): The preprocessing operations to apply to images, passed as a list.
* to_rgb(bool): Whether to convert the input from BGR mode to RGB mode, default is False.
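The composition pattern described above can be illustrated without PaddleHub or OpenCV. The following is a minimal sketch, not the library implementation: `MiniCompose` and `double` are hypothetical names, images are nested Python lists shaped [H][W][C], and the sketch assumes the BGR-to-RGB conversion is applied before the operators run.

```python
class MiniCompose:
    """Hypothetical stand-in for Compose; images are nested lists [H][W][C]."""

    def __init__(self, transforms, to_rgb=False):
        if not isinstance(transforms, list):
            raise TypeError('The transforms must be a list!')
        if len(transforms) < 1:
            # The wording of this message is an assumption, not taken from the library.
            raise ValueError('At least one transform is required.')
        self.transforms = transforms
        self.to_rgb = to_rgb

    def __call__(self, im):
        if self.to_rgb:
            # Reverse the channel order of every pixel: BGR -> RGB.
            im = [[pixel[::-1] for pixel in row] for row in im]
        # Apply each operator in order, feeding the output of one into the next.
        for op in self.transforms:
            im = op(im)
        return im


def double(im):
    """Toy operator: multiply every channel value by 2."""
    return [[[c * 2 for c in pixel] for pixel in row] for row in im]


bgr = [[[1, 2, 3], [4, 5, 6]]]
pipeline = MiniCompose([double], to_rgb=True)
print(pipeline(bgr))  # [[[6, 4, 2], [12, 10, 8]]]
```

Each operator only needs to be a callable that takes an image and returns an image, which is what lets operators be swapped freely inside the list.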
# Class `hub.vision.transforms.RandomHorizontalFlip`
hub.vision.transforms.RandomHorizontalFlip(prob: float = 0.5)
Randomly flip the image horizontally according to given probability.
* prob(float): The probability for flipping the image horizontally, default is 0.5.
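As a rough illustration of how `prob` gates the flip, here is a dependency-free sketch. It is not the library code: `horizontal_flip` and `random_horizontal_flip` are hypothetical helpers that work on an image stored as a list of rows.

```python
import random


def horizontal_flip(im):
    """Mirror each row of an image stored as a list of rows."""
    return [row[::-1] for row in im]


def random_horizontal_flip(im, prob=0.5):
    """Flip the image with probability `prob`, otherwise return it unchanged."""
    # random.random() returns a value in [0, 1), so prob=1.0 always flips
    # and prob=0.0 never flips.
    if random.random() < prob:
        im = horizontal_flip(im)
    return im


print(random_horizontal_flip([[1, 2, 3]], prob=1.0))  # [[3, 2, 1]]
```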
# Class `hub.vision.transforms.RandomVerticalFlip`
hub.vision.transforms.RandomVerticalFlip(prob: float = 0.5)
Randomly flip the image vertically according to given probability.
* prob(float): The probability for flipping the image vertically, default is 0.5.
# Class `hub.vision.transforms.Resize`
hub.vision.transforms.Resize(
    target_size: Union[List[int], int],
    interpolation: str = 'LINEAR')
Resize the input image to the target size.
* target_size(List[int]|int): Target image size.
* interpolation(str): Interpolation mode, default is 'LINEAR'. It supports 6 modes: 'NEAREST', 'LINEAR', 'CUBIC', 'AREA', 'LANCZOS4' and 'RANDOM'.
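The 'RANDOM' mode differs from the other five because it defers the choice to call time. A small sketch of that selection logic follows, under stated assumptions: `INTERPOLATIONS` and `pick_interpolation` are hypothetical names, and plain strings stand in for the OpenCV constants the real lookup table presumably holds.

```python
import random

# Stand-in table: the names come from the documentation above; the real
# implementation presumably maps them to OpenCV interpolation flags.
INTERPOLATIONS = ('NEAREST', 'LINEAR', 'CUBIC', 'AREA', 'LANCZOS4')


def pick_interpolation(mode='LINEAR'):
    """Validate `mode` and resolve 'RANDOM' to one concrete interpolation name."""
    if not (mode == 'RANDOM' or mode in INTERPOLATIONS):
        raise ValueError('interpolation should be one of {}'.format(INTERPOLATIONS))
    if mode == 'RANDOM':
        # A fresh choice is made on every call, so each image may be resized differently.
        return random.choice(INTERPOLATIONS)
    return mode


print(pick_interpolation('NEAREST'))  # NEAREST
```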
# Class `hub.vision.transforms.ResizeByLong`
hub.vision.transforms.ResizeByLong(long_size: Union[List[int], int])
Resize the input image so that its long side matches the target size.
* long_size(int|list[int]): The target size of the long side.
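To make the long-side rule concrete, the sketch below computes the output shape only. It assumes the aspect ratio is preserved and that both sides are rounded to the nearest integer; `resize_long_shape` is a hypothetical helper, not a PaddleHub function.

```python
def resize_long_shape(height, width, long_size):
    """Return the (height, width) after scaling the longer side to `long_size`."""
    # One scale factor is applied to both sides, which preserves the aspect ratio.
    scale = long_size / max(height, width)
    # Assumption: each side is rounded to the nearest integer.
    return int(round(height * scale)), int(round(width * scale))


print(resize_long_shape(300, 600, 256))  # (128, 256)
```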
# Class `hub.vision.transforms.ResizeRangeScaling`
hub.vision.transforms.ResizeRangeScaling(
    min_value: int = 400,
    max_value: int = 600)
Randomly select a targeted size to resize the image according to given range.
* min_value(int): The minimum value for targeted size.
* max_value(int): The maximum value for targeted size.
# Class `hub.vision.transforms.ResizeStepScaling`
hub.vision.transforms.ResizeStepScaling(
    min_scale_factor: float = 0.75,
    max_scale_factor: float = 1.25,
    scale_step_size: float = 0.25)
Randomly select a scale factor to resize the image according to given range.
* min_scale_factor(float): The minimum scale factor for targeted scale.
* max_scale_factor(float): The maximum scale factor for targeted scale.
* scale_step_size(float): Scale interval.
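One way to read the three parameters is as an arithmetic sequence of candidate scale factors from which one is drawn at random. The sketch below follows that reading; it assumes a positive step, and `candidate_scale_factors` is a hypothetical helper rather than part of the library.

```python
import random


def candidate_scale_factors(min_scale=0.75, max_scale=1.25, step=0.25):
    """List the scale factors min, min + step, ... up to max (assumes step > 0)."""
    if min_scale > max_scale:
        raise ValueError('min_scale must not exceed max_scale')
    if min_scale == max_scale:
        return [min_scale]
    num_steps = int(round((max_scale - min_scale) / step)) + 1
    return [min_scale + i * step for i in range(num_steps)]


factors = candidate_scale_factors()
print(factors)  # [0.75, 1.0, 1.25]
# One factor is drawn per image; both sides of the image are scaled by it.
scale = random.choice(factors)
```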
# Class `hub.vision.transforms.Normalize`
hub.vision.transforms.Normalize(
    mean: list = [0.5, 0.5, 0.5],
    std: list = [0.5, 0.5, 0.5])
Normalize the input image.
* mean(list): Mean value for normalization.
* std(list): Standard deviation for normalization.
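The arithmetic behind per-channel normalization can be shown on a single pixel. This sketch assumes pixel values are first scaled from [0, 255] to [0, 1] and then standardized channel by channel; that scaling step is an assumption about the implementation, and `normalize_pixel` is a hypothetical helper.

```python
def normalize_pixel(pixel, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Normalize one pixel: scale to [0, 1], subtract the mean, divide by the std."""
    # Assumption: inputs are in [0, 255] and are divided by 255 before standardizing.
    return [((c / 255.0) - m) / s for c, m, s in zip(pixel, mean, std)]


print(normalize_pixel([0, 127.5, 255]))  # [-1.0, 0.0, 1.0]
```

With the default mean and standard deviation of 0.5, this maps the range [0, 255] onto [-1, 1].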
# Class `hub.vision.transforms.Padding`
hub.vision.transforms.Padding(
    target_size: Union[List[int], Tuple[int], int],
    im_padding_value: list = [127.5, 127.5, 127.5])
Pad the input image to the target size using the given padding value.
* target_size(Union[List[int], Tuple[int], int]): Targeted image size.
* im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
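A simplified sketch of the padding arithmetic is given below. It covers only the case of an integer target size and a single-channel image stored as a list of rows, and it assumes padding is added at the bottom and right edges and that an image larger than the target is rejected. `padding_amounts` and `pad_image` are hypothetical helpers.

```python
def padding_amounts(im_height, im_width, target_size):
    """Return (pad_bottom, pad_right) needed to reach a square `target_size`."""
    pad_bottom = target_size - im_height
    pad_right = target_size - im_width
    if pad_bottom < 0 or pad_right < 0:
        # Assumption: padding cannot shrink an image, so this is treated as an error.
        raise ValueError('The image is larger than the target size.')
    return pad_bottom, pad_right


def pad_image(im, target_size, value=127.5):
    """Pad a single-channel image (list of rows) at the bottom and right."""
    pad_bottom, pad_right = padding_amounts(len(im), len(im[0]), target_size)
    padded = [row + [value] * pad_right for row in im]
    padded += [[value] * target_size for _ in range(pad_bottom)]
    return padded


print(padding_amounts(100, 150, 176))  # (76, 26)
print(pad_image([[1, 2]], 2))          # [[1, 2], [127.5, 127.5]]
```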
# Class `hub.vision.transforms.RandomPaddingCrop`
hub.vision.transforms.RandomPaddingCrop(
    crop_size: Union[List[int], Tuple[int], int],
    im_padding_value: list = [127.5, 127.5, 127.5])
Pad the input image if the crop size is greater than the image size. Otherwise, crop the input image to the given size.
* crop_size(Union[List[int], Tuple[int], int]): Targeted image size.
* im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
# Class `hub.vision.transforms.RandomBlur`
hub.vision.transforms.RandomBlur(prob: float = 0.1)
Randomly blur the input image with a Gaussian filter according to the given probability.
* prob(float): The probability to blur the image, default is 0.1.
# Class `hub.vision.transforms.RandomRotation`
hub.vision.transforms.RandomRotation(
    max_rotation: float = 15.,
    im_padding_value: list = [127.5, 127.5, 127.5])
Rotate the input image by a random angle. The angle will not exceed max_rotation.
* max_rotation(float): Upper bound of rotation angle.
* im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
# Class `hub.vision.transforms.RandomDistort`
hub.vision.transforms.RandomDistort(
    brightness_range: float = 0.5,
    brightness_prob: float = 0.5,
    contrast_range: float = 0.5,
    contrast_prob: float = 0.5,
    saturation_range: float = 0.5,
    saturation_prob: float = 0.5,
    hue_range: float = 18.,
    hue_prob: float = 0.5)
Randomly adjust brightness, contrast, saturation and hue according to the given random range and probability of each.
* brightness_range(float): Boundary of brightness.
* brightness_prob(float): Probability of disturbing the brightness of the image.
* contrast_range(float): Boundary of contrast.
* contrast_prob(float): Probability of disturbing the contrast of the image.
* saturation_range(float): Boundary of saturation.
* saturation_prob(float): Probability of disturbing the saturation of the image.
* hue_range(float): Boundary of hue.
* hue_prob(float): Probability of disturbing the hue of the image.
# Class `hub.vision.transforms.RGB2LAB`
Convert color space from RGB to LAB.
# Class `hub.vision.transforms.LAB2RGB`
Convert color space from LAB to RGB.
# Class `hub.vision.transforms.CenterCrop`
hub.vision.transforms.CenterCrop(crop_size: int)
Crop the middle part of the image to the specified size.
* crop_size(int): Target size of the cropped image.
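The centre-crop geometry can be sketched in a few lines. This is an illustration only: it assumes a square crop no larger than the image and integer division for the offsets, and `center_crop_box` and `center_crop` are hypothetical helpers that operate on a list of rows.

```python
def center_crop_box(height, width, crop_size):
    """Return (top, left, bottom, right) of a centred square crop."""
    # Assumption: crop_size <= height and crop_size <= width.
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2
    return top, left, top + crop_size, left + crop_size


def center_crop(im, crop_size):
    """Crop the middle `crop_size` x `crop_size` region of a list-of-rows image."""
    top, left, bottom, right = center_crop_box(len(im), len(im[0]), crop_size)
    return [row[left:right] for row in im[top:bottom]]


image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(center_crop_box(256, 256, 176))  # (40, 40, 216, 216)
print(center_crop(image, 2))           # [[5, 6], [9, 10]]
```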
\ No newline at end of file
@@ -2,9 +2,177 @@
Training a new task from scratch is time-consuming and may not reach the desired quality. Instead, you can fine-tune one of the pretrained models that PaddleHub provides on your specific task: preprocess your custom data accordingly, feed it to the pretrained model, and obtain the results. Set up the dataset structure as described below.
## 1. Image classification dataset
### Data preparation
├─data: data directory
image 1 path image 1 label
image 2 path image 2 label
roses/8050213579_48e1e7109f.jpg 0
sunflowers/45045003_30bbd0a142_m.jpg 3
daisy/3415180846_d7b5cced14_m.jpg 2
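Each line of the list file pairs a relative image path with an integer class label, separated by whitespace. A minimal parser for that format might look like the sketch below; `parse_label_lines` is a hypothetical helper, not the loader PaddleHub uses, and it assumes the label is always the last whitespace-separated field.

```python
def parse_label_lines(lines):
    """Turn 'path label' lines into a list of (path, int_label) tuples."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so the label is the final field.
        path, label = line.rsplit(None, 1)
        samples.append((path, int(label)))
    return samples


example = [
    'roses/8050213579_48e1e7109f.jpg 0',
    'sunflowers/45045003_30bbd0a142_m.jpg 3',
    'daisy/3415180846_d7b5cced14_m.jpg 2',
]
for path, label in parse_label_lines(example):
    print(label, path)
```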
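### Loading the dataset
For the dataset preparation code, see [flowers.py](../../paddlehub/datasets/flowers.py). `hub.datasets.Flowers()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory. Usage is as follows: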
from paddlehub.datasets import Flowers
flowers = Flowers(transforms)
flowers_validate = Flowers(transforms, mode='val')
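* `transforms`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train`, `test`, and `val`, default is `train`.
## 2. Image colorization dataset
When using custom data for a colorization task with PaddleHub, split the dataset into a training set and a test set.
### Data preparation
├─data: data directory
PaddleHub provides the `Canvas` dataset for colorization. It consists of 1193 Monet-style images and 400 Van Gogh-style images. Taking the [Canvas dataset](../reference/dataset.md) as an example, the train folder contains the following:
### Loading the dataset
For the dataset preparation code, see [canvas.py](../../paddlehub/datasets/canvas.py). `hub.datasets.Canvas()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory. Usage is as follows: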
from paddlehub.datasets import Canvas
color_set = Canvas(transforms, mode='train')
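* `transforms`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train` and `test`, default is `train`.
## 3. Style transfer dataset
### Data preparation
├─data: data directory
|- 21styles
PaddleHub provides the `MiniCOCO` dataset for style transfer. The training and test data come from COCO2014, with 2001 images in the training set and 200 images in the test set. The `21styles` folder holds 21 images in different styles, which you can replace with style images of your own. Taking the [MiniCOCO dataset](../reference/dataset.md) as an example, the train folder contains the following:
### Loading the dataset
For the dataset preparation code, see [minicoco.py](../../paddlehub/datasets/minicoco.py). `hub.datasets.MiniCOCO()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory. Usage is as follows: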
from paddlehub.datasets import MiniCOCO
color_set = MiniCOCO(transforms, mode='train')
* `transforms`: the data preprocessing pipeline.
* `mode`: the dataset split to use; options are `train` and `test`, default is `train`.
@@ -69,7 +69,8 @@ class ImageClassifierModule(RunModule, ImageServing):
         images = batch[0]
         labels = paddle.unsqueeze(batch[1], axis=-1)
-        preds = self(images)
+        preds, feature = self(images)
         loss, _ = F.softmax_with_cross_entropy(preds, labels, return_softmax=True, axis=1)
         loss = paddle.mean(loss)
         acc = paddle.metric.accuracy(preds, labels)
@@ -89,10 +90,11 @@ class ImageClassifierModule(RunModule, ImageServing):
         images = self.transforms(images)
         if len(images.shape) == 3:
             images = images[np.newaxis, :]
-        preds = self(paddle.to_tensor(images))
+        preds, feature = self(paddle.to_tensor(images))
         preds = F.softmax(preds, axis=1).numpy()
         pred_idxs = np.argsort(preds)[::-1][:, :top_k]
         res = []
         for i, pred in enumerate(pred_idxs):
             res_dict = {}
             for k in pred:
@@ -73,7 +73,7 @@ def resize_long(im: np.ndarray, long_size: int, interpolation: int = cv2.INTER_L
     Args:
         im(np.ndarray): Input image.
-        target_size(int|list[int]): The target size of long side.
+        long_size(int|list[int]): The target size of long side.
         interpolation(int): Interpolation method. Default to cv2.INTER_LINEAR.
     '''
     value = max(im.shape[0], im.shape[1])
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
# limitations under the License. # limitations under the License.
import random import random
from typing import Callable from typing import Callable, Union, List, Tuple
import cv2 import cv2
import PIL import PIL
...@@ -23,7 +23,14 @@ import paddlehub.vision.functional as F ...@@ -23,7 +23,14 @@ import paddlehub.vision.functional as F
class Compose: class Compose:
def __init__(self, transforms, to_rgb=False): """
Compose preprocessing operators for obtaining prepocessed data. The shape of input image for all operations is [H, W, C], where H is the image height, W is the image width, and C is the number of image channels.
transforms(callmethod) : The method of preprocess images.
to_rgb(bool): Whether to transform the input from BGR mode to RGB mode, default is False.
def __init__(self, transforms: Callable, to_rgb: bool = False):
if not isinstance(transforms, list): if not isinstance(transforms, list):
raise TypeError('The transforms must be a list!') raise TypeError('The transforms must be a list!')
if len(transforms) < 1: if len(transforms) < 1:
...@@ -32,7 +39,7 @@ class Compose: ...@@ -32,7 +39,7 @@ class Compose:
self.transforms = transforms self.transforms = transforms
self.to_rgb = to_rgb self.to_rgb = to_rgb
def __call__(self, im): def __call__(self, im: Union[np.ndarray, str]):
if isinstance(im, str): if isinstance(im, str):
im = cv2.imread(im).astype('float32') im = cv2.imread(im).astype('float32')
...@@ -50,26 +57,45 @@ class Compose: ...@@ -50,26 +57,45 @@ class Compose:
class RandomHorizontalFlip: class RandomHorizontalFlip:
def __init__(self, prob=0.5): """
Randomly flip the image horizontally according to given probability.
prob(float): The probability for flipping the image horizontally, default is 0.5.
def __init__(self, prob: float = 0.5):
self.prob = prob self.prob = prob
def __call__(self, im): def __call__(self, im: np.ndarray):
if random.random() < self.prob: if random.random() < self.prob:
im = F.horizontal_flip(im) im = F.horizontal_flip(im)
return im return im
class RandomVerticalFlip: class RandomVerticalFlip:
def __init__(self, prob=0.5): """
Randomly flip the image vertically according to given probability.
prob(float): The probability for flipping the image vertically, default is 0.5.
def __init__(self, prob: float = 0.5):
self.prob = prob self.prob = prob
def __call__(self, im): def __call__(self, im: np.ndarray):
if random.random() < self.prob: if random.random() < self.prob:
im = F.vertical_flip(im) im = F.vertical_flip(im)
return im return im
class Resize: class Resize:
Resize input image to target size.
target_size(List[int]|int]): Target image size.
interpolation(str): Interpolation mode, default is 'LINEAR'. It support 6 modes: 'NEAREST', 'LINEAR', 'CUBIC', 'AREA', 'LANCZOS4' and 'RANDOM'.
# The interpolation mode # The interpolation mode
interpolation_dict = { interpolation_dict = {
...@@ -79,7 +105,7 @@ class Resize: ...@@ -79,7 +105,7 @@ class Resize:
} }
def __init__(self, target_size, interpolation='LINEAR'): def __init__(self, target_size: Union[List[int], int], interpolation: str = 'LINEAR'):
self.interpolation = interpolation self.interpolation = interpolation
if not (interpolation == "RANDOM" or interpolation in self.interpolation_dict): if not (interpolation == "RANDOM" or interpolation in self.interpolation_dict):
raise ValueError("interpolation should be one of {}".format(self.interpolation_dict.keys())) raise ValueError("interpolation should be one of {}".format(self.interpolation_dict.keys()))
...@@ -93,7 +119,7 @@ class Resize: ...@@ -93,7 +119,7 @@ class Resize:
self.target_size = target_size self.target_size = target_size
def __call__(self, im): def __call__(self, im: np.ndarray):
if self.interpolation == "RANDOM": if self.interpolation == "RANDOM":
interpolation = random.choice(list(self.interpolation_dict.keys())) interpolation = random.choice(list(self.interpolation_dict.keys()))
else: else:
...@@ -103,7 +129,13 @@ class Resize: ...@@ -103,7 +129,13 @@ class Resize:
class ResizeByLong: class ResizeByLong:
def __init__(self, long_size): """
Resize the long side of the input image to the target size.
long_size(int|list[int]): The target size of long side.
def __init__(self, long_size: Union[List[int], int]):
self.long_size = long_size self.long_size = long_size
def __call__(self, im): def __call__(self, im):
...@@ -112,14 +144,21 @@ class ResizeByLong: ...@@ -112,14 +144,21 @@ class ResizeByLong:
class ResizeRangeScaling: class ResizeRangeScaling:
def __init__(self, min_value=400, max_value=600): """
Randomly select a targeted size to resize the image according to given range.
min_value(int): The minimum value for targeted size.
max_value(int): The maximum value for targeted size.
def __init__(self, min_value: int = 400, max_value: int = 600):
if min_value > max_value: if min_value > max_value:
raise ValueError('min_value must be less than max_value, ' raise ValueError('min_value must be less than max_value, '
'but they are {} and {}.'.format(min_value, max_value)) 'but they are {} and {}.'.format(min_value, max_value))
self.min_value = min_value self.min_value = min_value
self.max_value = max_value self.max_value = max_value
def __call__(self, im): def __call__(self, im: np.ndarray):
if self.min_value == self.max_value: if self.min_value == self.max_value:
random_size = self.max_value random_size = self.max_value
else: else:
...@@ -129,7 +168,16 @@ class ResizeRangeScaling: ...@@ -129,7 +168,16 @@ class ResizeRangeScaling:
class ResizeStepScaling: class ResizeStepScaling:
def __init__(self, min_scale_factor=0.75, max_scale_factor=1.25, scale_step_size=0.25): """
Randomly select a scale factor to resize the image according to given range.
min_scale_factor(float): The minimum scale factor for targeted scale.
max_scale_factor(float): The maximum scale factor for targeted scale.
scale_step_size(float): Scale interval.
def __init__(self, min_scale_factor: float = 0.75, max_scale_factor: float = 1.25, scale_step_size: float = 0.25):
if min_scale_factor > max_scale_factor: if min_scale_factor > max_scale_factor:
raise ValueError('min_scale_factor must be less than max_scale_factor, ' raise ValueError('min_scale_factor must be less than max_scale_factor, '
'but they are {} and {}.'.format(min_scale_factor, max_scale_factor)) 'but they are {} and {}.'.format(min_scale_factor, max_scale_factor))
...@@ -137,7 +185,7 @@ class ResizeStepScaling: ...@@ -137,7 +185,7 @@ class ResizeStepScaling:
self.max_scale_factor = max_scale_factor self.max_scale_factor = max_scale_factor
self.scale_step_size = scale_step_size self.scale_step_size = scale_step_size
def __call__(self, im): def __call__(self, im: np.ndarray):
if self.min_scale_factor == self.max_scale_factor: if self.min_scale_factor == self.max_scale_factor:
scale_factor = self.min_scale_factor scale_factor = self.min_scale_factor
...@@ -157,7 +205,14 @@ class ResizeStepScaling: ...@@ -157,7 +205,14 @@ class ResizeStepScaling:
class Normalize: class Normalize:
def __init__(self, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]): """
Normalize the input image.
mean(list): Mean value for normalization.
std(list): Standard deviation for normalization.
def __init__(self, mean: list = [0.5, 0.5, 0.5], std: list = [0.5, 0.5, 0.5]):
self.mean = mean self.mean = mean
self.std = std self.std = std
if not (isinstance(self.mean, list) and isinstance(self.std, list)): if not (isinstance(self.mean, list) and isinstance(self.std, list)):
...@@ -174,7 +229,14 @@ class Normalize: ...@@ -174,7 +229,14 @@ class Normalize:
class Padding: class Padding:
def __init__(self, target_size, im_padding_value=[127.5, 127.5, 127.5]): """
Padding input into targeted size according to specific padding value.
target_size(Union[List[int], Tuple[int], int]): Targeted image size.
im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
def __init__(self, target_size: Union[List[int], Tuple[int], int], im_padding_value: list = [127.5, 127.5, 127.5]):
if isinstance(target_size, list) or isinstance(target_size, tuple): if isinstance(target_size, list) or isinstance(target_size, tuple):
if len(target_size) != 2: if len(target_size) != 2:
raise ValueError( raise ValueError(
...@@ -185,7 +247,7 @@ class Padding: ...@@ -185,7 +247,7 @@ class Padding:
self.target_size = target_size self.target_size = target_size
self.im_padding_value = im_padding_value self.im_padding_value = im_padding_value
def __call__(self, im): def __call__(self, im: np.ndarray):
im_height, im_width = im.shape[0], im.shape[1] im_height, im_width = im.shape[0], im.shape[1]
if isinstance(self.target_size, int): if isinstance(self.target_size, int):
target_height = self.target_size target_height = self.target_size
...@@ -206,6 +268,13 @@ class Padding: ...@@ -206,6 +268,13 @@ class Padding:
class RandomPaddingCrop: class RandomPaddingCrop:
Padding input image if crop size is greater than image size. Otherwise, crop the input image to given size.
crop_size(Union[List[int], Tuple[int], int]): Targeted image size.
im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
def __init__(self, crop_size, im_padding_value=[127.5, 127.5, 127.5]): def __init__(self, crop_size, im_padding_value=[127.5, 127.5, 127.5]):
if isinstance(crop_size, list) or isinstance(crop_size, tuple): if isinstance(crop_size, list) or isinstance(crop_size, tuple):
if len(crop_size) != 2: if len(crop_size) != 2:
...@@ -247,10 +316,16 @@ class RandomPaddingCrop: ...@@ -247,10 +316,16 @@ class RandomPaddingCrop:
class RandomBlur: class RandomBlur:
def __init__(self, prob=0.1): """
Random blur input image by Gaussian filter according to given probability.
prob(float): The probability to blur the image, default is 0.1.
def __init__(self, prob: float = 0.1):
self.prob = prob self.prob = prob
def __call__(self, im): def __call__(self, im: np.ndarray):
if self.prob <= 0: if self.prob <= 0:
n = 0 n = 0
elif self.prob >= 1: elif self.prob >= 1:
...@@ -270,7 +345,15 @@ class RandomBlur: ...@@ -270,7 +345,15 @@ class RandomBlur:
class RandomRotation: class RandomRotation:
def __init__(self, max_rotation=15, im_padding_value=[127.5, 127.5, 127.5]): """
Rotate the input image at random angle. The angle will not exceed to max_rotation.
max_rotation(float): Upper bound of rotation angle.
im_padding_value(list): Border value for 3 channels, default is [127.5, 127.5, 127.5].
def __init__(self, max_rotation: float = 15, im_padding_value: list = [127.5, 127.5, 127.5]):
self.max_rotation = max_rotation self.max_rotation = max_rotation
self.im_padding_value = im_padding_value self.im_padding_value = im_padding_value
...@@ -301,47 +384,32 @@ class RandomRotation: ...@@ -301,47 +384,32 @@ class RandomRotation:
return im return im
class RandomScaleAspect:
def __init__(self, min_scale=0.5, aspect_ratio=0.33):
self.min_scale = min_scale
self.aspect_ratio = aspect_ratio
def __call__(self, im):
if self.min_scale != 0 and self.aspect_ratio != 0:
img_height = im.shape[0]
img_width = im.shape[1]
for i in range(0, 10):
area = img_height * img_width
target_area = area * np.random.uniform(self.min_scale, 1.0)
aspectRatio = np.random.uniform(self.aspect_ratio, 1.0 / self.aspect_ratio)
dw = int(np.sqrt(target_area * 1.0 * aspectRatio))
dh = int(np.sqrt(target_area * 1.0 / aspectRatio))
if (np.random.randint(10) < 5):
tmp = dw
dw = dh
dh = tmp
if (dh < img_height and dw < img_width):
h1 = np.random.randint(0, img_height - dh)
w1 = np.random.randint(0, img_width - dw)
im = im[h1:(h1 + dh), w1:(w1 + dw), :]
im = cv2.resize(im, (img_width, img_height), interpolation=cv2.INTER_LINEAR)
return im class RandomDistort:
Random adjust brightness, contrast, saturation and hue according to the given random range and probability, respectively.
class RandomDistort: brightness_range(float): Boundary of brightness.
brightness_prob(float): Probability for disturb the brightness of image.
contrast_range(float): Boundary of contrast.
contrast_prob(float): Probability for disturb the contrast of image.
saturation_range(float): Boundary of saturation.
saturation_prob(float): Probability for disturb the saturation of image.
hue_range(float): Boundary of hue.
hue_prob(float): Probability for disturb the hue of image.
def __init__(self, def __init__(self,
brightness_range=0.5, brightness_range: float = 0.5,
brightness_prob=0.5, brightness_prob: float = 0.5,
contrast_range=0.5, contrast_range: float = 0.5,
contrast_prob=0.5, contrast_prob: float = 0.5,
saturation_range=0.5, saturation_range: float = 0.5,
saturation_prob=0.5, saturation_prob: float = 0.5,
hue_range=18, hue_range: float = 18,
hue_prob=0.5): hue_prob: float = 0.5):
self.brightness_range = brightness_range self.brightness_range = brightness_range
self.brightness_prob = brightness_prob self.brightness_prob = brightness_prob
self.contrast_range = contrast_range self.contrast_range = contrast_range
...@@ -351,7 +419,7 @@ class RandomDistort: ...@@ -351,7 +419,7 @@ class RandomDistort:
self.hue_range = hue_range self.hue_range = hue_range
self.hue_prob = hue_prob self.hue_prob = hue_prob
def __call__(self, im): def __call__(self, im: np.ndarray):
brightness_lower = 1 - self.brightness_range brightness_lower = 1 - self.brightness_range
brightness_upper = 1 + self.brightness_range brightness_upper = 1 + self.brightness_range
contrast_lower = 1 - self.contrast_range contrast_lower = 1 - self.contrast_range
...@@ -360,7 +428,7 @@ class RandomDistort: ...@@ -360,7 +428,7 @@ class RandomDistort:
saturation_upper = 1 + self.saturation_range saturation_upper = 1 + self.saturation_range
hue_lower = -self.hue_range hue_lower = -self.hue_range
hue_upper = self.hue_range hue_upper = self.hue_range
ops = ['brightness', 'contrast', 'saturation', 'hue'] ops = [F.brightness, F.contrast, F.saturation, F.hue]
random.shuffle(ops) random.shuffle(ops)
params_dict = { params_dict = {
'brightness': { 'brightness': {
...@@ -539,27 +607,6 @@ class LAB2RGB: ...@@ -539,27 +607,6 @@ class LAB2RGB:
return self.lab2rgb(img) return self.lab2rgb(img)
class ColorPostprocess:
Transform images from [0, 1] to [0, 255]
type(type): Type of Image value.
img(np.ndarray): Image in range of 0-255.
def __init__(self, type: type = np.uint8):
self.type = type
def __call__(self, img: np.ndarray):
img = np.transpose(img, (1, 2, 0))
img = np.clip(img, 0, 1) * 255
img = img.astype(self.type)
return img
class CenterCrop: class CenterCrop:
""" """
Crop the middle part of the image to the specified size. Crop the middle part of the image to the specified size.