Unverified commit 956efd9d, authored by CC, committed by GitHub

[Paper Reproduction Challenge] InvDN (#702)

* InvDN submission

* correct invdn_denoising

* Add Chinese and English docs; add and fix some features

* Delete invdn.png

* Update invdn.md

* Update invdn.md

* fix bugs

* update invdn.md

* fix bugs

* fix path bugs
Parent 349bcb10
......@@ -120,6 +120,7 @@ GAN-Generative Adversarial Network, was praised by "the Father of Convolutional
* [FaceEnhancement](./docs/en_US/tutorials/face_enhancement.md)
* [PReNet](./docs/en_US/tutorials/prenet.md)
* [SwinIR](./docs/en_US/tutorials/swinir.md)
* [InvDN](./docs/en_US/tutorials/invdn.md)
## Composite Application
......
......@@ -138,7 +138,7 @@ GAN (Generative Adversarial Network), praised by "the Father of Convolutional Networks" **Yann LeCun (杨立昆)
* Video super resolution: [Video Super Resolution(VSR)](./docs/zh_CN/tutorials/video_super_resolution.md)
* Included models: ⭐ PP-MSVSR ⭐, EDVR, BasicVSR, BasicVSR++
* Image and video restoration
* Image deblurring, denoising, and deraining: [MPR Net](./docs/zh_CN/tutorials/mpr_net.md)[SwinIR](./docs/zh_CN/tutorials/swinir.md)
* Image deblurring, denoising, and deraining: [MPR Net](./docs/zh_CN/tutorials/mpr_net.md)[SwinIR](./docs/zh_CN/tutorials/swinir.md)[InvDN](./docs/zh_CN/tutorials/invdn.md)
* Video deblurring: [EDVR](./docs/zh_CN/tutorials/video_super_resolution.md)
* Image deraining: [PReNet](./docs/zh_CN/tutorials/prenet.md)
......
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import argparse

sys.path.insert(0, os.getcwd())
import paddle
from ppgan.apps import InvDNPredictor

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_path",
                        type=str,
                        default='output_dir',
                        help="path to output image dir")

    parser.add_argument("--weight_path",
                        type=str,
                        default=None,
                        help="path to model checkpoint path")

    parser.add_argument("--seed",
                        type=int,
                        default=None,
                        help="sample random seed for model's image generation")

    parser.add_argument('--images_path',
                        default=None,
                        required=True,
                        type=str,
                        help='Single image or images directory.')

    parser.add_argument("--cpu",
                        dest="cpu",
                        action="store_true",
                        help="cpu mode.")

    parser.add_argument(
        "--disable_mc",
        action="store_true",
        help=
        "Disable the Monte Carlo Self Ensemble in the paper to boost the speed during test. Performance may degrade."
    )

    args = parser.parse_args()

    if args.cpu:
        paddle.set_device('cpu')

    predictor = InvDNPredictor(output_path=args.output_path,
                               weight_path=args.weight_path,
                               seed=args.seed)
    predictor.run(images_path=args.images_path, disable_mc=args.disable_mc)
total_iters: 150000
output_dir: output_dir

model:
  name: InvDNModel
  generator:
    name: InvDN
    channel_in: 3
    channel_out: 3
    block_num: [8, 8]
    scale: 4
    down_num: 2

dataset:
  train:
    name: InvDNDataset
    num_workers: 10
    batch_size: 14  # 4 GPUs
    opt:
      phase: train
      scale: 4
      crop_size: 144
      train_dir: data/SIDD_Medium_Srgb_Patches_512/train/
  test:
    name: InvDNDataset
    num_workers: 1
    batch_size: 1
    opt:
      phase: test
      scale: 4
      val_dir: data/SIDD_Valid_Srgb_Patches_256/valid/

export_model:
  - {name: 'generator', inputs_num: 1}

lr_scheduler:
  name: MultiStepDecay
  learning_rate: 8e-4  # num_gpu * 2e-4
  milestones: [25000, 50000, 75000, 100000, 125000, 135000, 145000]
  gamma: 0.5

validate:
  interval: 500
  save_img: True

  metrics:
    psnr:  # metric name, can be arbitrary
      name: PSNR
      crop_border: 4
      test_y_channel: True
    ssim:
      name: SSIM
      crop_border: 4
      test_y_channel: True

optimizer:
  name: Adam
  # add parameters of net_name to optim
  # name should in self.nets
  net_names:
    - generator
  beta1: 0.9
  beta2: 0.99
  epsilon: 1e-8  # TODO GRADIENT_CLIPPING

log_config:
  interval: 100
  visiual_interval: 5000

snapshot_config:
  interval: 500
English | [Chinese](../../zh_CN/tutorials/invdn.md)
# Invertible Denoising Network: A Light Solution for Real Noise Removal
**Invertible Denoising Network: A Light Solution for Real Noise Removal** (CVPR 2021)
Official code: [https://github.com/Yang-Liu1082/InvDN](https://github.com/Yang-Liu1082/InvDN)
Paper: [https://arxiv.org/abs/2104.10546](https://arxiv.org/abs/2104.10546)
## 1 Introduction
InvDN uses an invertible network to split a noisy image into a low-resolution clean image and a high-frequency latent representation that contains both noise and image content. Because an invertible network is lossless, separating the noise from the high-frequency representation would allow the clean image to be reconstructed at the original resolution from that representation together with the low-resolution clean image. In practice, removing the noise from the high-frequency information is difficult. The paper instead discards the noisy high-frequency latent representation and, during the inverse pass, replaces it with one sampled from a prior distribution; combined with the low-resolution clean image, this reconstructs the clean image at the original resolution. The resulting network is lightweight.
![invdn](https://user-images.githubusercontent.com/51016595/195344773-9ea17ef5-9edd-4310-bfff-36049bbcefde.png)
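The lossless property described above can be illustrated with a small NumPy sketch of one invertible coupling step. This is only an illustration under stated assumptions, not the PaddleGAN implementation: the sub-networks `f`, `g`, and `h` are hypothetical random linear maps (InvDN uses convolutional blocks), and the inputs are plain channel vectors rather than images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the learned sub-networks (assumption for illustration).
W_f = rng.normal(size=(9, 3)) * 0.1  # maps the 9 high-frequency channels to 3
W_g = rng.normal(size=(3, 9)) * 0.1  # maps the 3 low-resolution channels to 9
W_h = rng.normal(size=(3, 9)) * 0.1  # produces a log-scale for the 9 channels


def f(x2):
    return x2 @ W_f


def g(y1):
    return y1 @ W_g


def h(y1):
    # Bounded log-scale so that exp() stays numerically stable.
    return np.tanh(y1 @ W_h)


def coupling_forward(x):
    """Split x into (x1, x2) and apply the coupling transform."""
    x1, x2 = x[:, :3], x[:, 3:]
    y1 = x1 + f(x2)
    y2 = x2 * np.exp(h(y1)) + g(y1)
    return np.concatenate([y1, y2], axis=1)


def coupling_inverse(y):
    """Exactly undo coupling_forward without inverting f, g, or h."""
    y1, y2 = y[:, :3], y[:, 3:]
    x2 = (y2 - g(y1)) / np.exp(h(y1))
    x1 = y1 - f(x2)
    return np.concatenate([x1, x2], axis=1)


x = rng.normal(size=(4, 12))
x_rec = coupling_inverse(coupling_forward(x))
max_error = float(np.abs(x - x_rec).max())  # only floating-point round-off remains
```

Because the inverse only needs to evaluate `f`, `g`, and `h` in the forward direction, the sub-networks can be arbitrary functions and the step remains exactly invertible.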
## 2 How to use
### 2.1 Quick start
After installing `PaddleGAN`, run the following command to generate the restored image under `./output_dir/Denoising/` (saved as `image_name_restoration.png`, next to a copy of the noisy input `image_name.png`).
```sh
python applications/tools/invdn_denoising.py --images_path ${PATH_OF_IMAGE}
```
Here `PATH_OF_IMAGE` is the path of the image to denoise, or the path of the folder containing the images.
- Note that the author's original code uses Monte Carlo self-ensemble at test time to improve performance, at the cost of speed. You can pass `--disable_mc` to turn it off for faster inference. (Monte Carlo self-ensemble is enabled by default for $test$, and disabled by default for $train$ and $valid$.)
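The flip-and-average part of the self-ensemble mentioned above can be sketched in NumPy as follows. This is a simplified illustration, not the predictor's code: `denoise` is a hypothetical callable standing in for one forward pass of the network, and the fresh Gaussian latent sample that the predictor draws for each pass is assumed to happen inside that callable.

```python
import numpy as np


def self_ensemble_x8(x, denoise):
    """Average `denoise` over the 8 flip/transpose variants of x (N, C, H, W)."""
    outputs = []
    for transpose in (False, True):
        for flip_h in (False, True):
            for flip_w in (False, True):
                aug = x
                if flip_w:
                    aug = aug[:, :, :, ::-1]
                if flip_h:
                    aug = aug[:, :, ::-1, :]
                if transpose:
                    aug = aug.transpose(0, 1, 3, 2)
                out = denoise(aug)
                # Undo the transforms in reverse order so all outputs align.
                if transpose:
                    out = out.transpose(0, 1, 3, 2)
                if flip_h:
                    out = out[:, :, ::-1, :]
                if flip_w:
                    out = out[:, :, :, ::-1]
                outputs.append(out)
    return np.mean(outputs, axis=0)


x = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)
# With an identity "denoiser" the ensemble must return the input unchanged.
identity_result = self_ensemble_x8(x, lambda t: t)
```

The eight forward passes explain the slowdown: with `--disable_mc` the predictor runs a single pass instead.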
### 2.2 Prepare dataset
#### **Train Dataset**
This reproduction uses the SIDD dataset; the training set is [SIDD-Medium](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php). Following the paper, the dataset is cropped into patches of $512 \times 512$. Training also requires a low-resolution version of each GT image, of size $128 \times 128$, which is denoted LQ.
The processed dataset can be found on [AI Studio](https://aistudio.baidu.com/aistudio/datasetdetail/172084).
The train dataset is placed under: `data/SIDD_Medium_Srgb_Patches_512/train/`.
#### **Test Dataset**
The test dataset is [SIDD_valid](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php). The files downloaded from the official website are `./ValidationNoisyBlocksSrgb.mat` and `./ValidationGtBlocksSrgb.mat`. Converting them to $.png$ is recommended for convenience.
The converted dataset can be found on [AI Studio](https://aistudio.baidu.com/aistudio/datasetdetail/172069).
The test dataset is placed under: `data/SIDD_Valid_Srgb_Patches_256/valid/`.
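The conversion from block arrays to individual image files can be sketched as below. This is a hedged sketch: it assumes the `.mat` data is an array laid out as `(num_images, num_blocks, H, W, 3)`, and the `{image}_{block}.PNG` naming is a hypothetical choice made to resemble the `0_0.PNG` files in the layout shown in this section; the `.mat` variable name in the comment is likewise an assumption.

```python
import numpy as np


def iter_block_files(blocks):
    """Yield (file_name, patch) for a (num_images, num_blocks, H, W, 3) array."""
    num_images, num_blocks = blocks.shape[:2]
    for i in range(num_images):
        for j in range(num_blocks):
            # Hypothetical naming scheme: <image index>_<block index>.PNG
            yield f"{i}_{j}.PNG", blocks[i, j]


# Small stand-in for the array loaded from the .mat file (assumed layout).
demo_blocks = np.zeros((2, 3, 256, 256, 3), dtype=np.uint8)
demo_files = list(iter_block_files(demo_blocks))

# With real data, loading and saving might look like this (assumed key name):
#   blocks = scipy.io.loadmat("ValidationNoisyBlocksSrgb.mat")["ValidationNoisyBlocksSrgb"]
#   cv2.imwrite(os.path.join(out_dir, name), cv2.cvtColor(patch, cv2.COLOR_RGB2BGR))
```

Using the same naming for the `GT` and `Noisy` folders matters because the dataset class pairs files by sorted order.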
- The file structure under the `PaddleGAN/data` folder is
```sh
data
├─ SIDD_Medium_Srgb_Patches_512
│   └─ train
│       ├─ GT
│       │      0_0.PNG
│       │      ...
│       ├─ LQ
│       │      0_0.PNG
│       │      ...
│       └─ Noisy
│              0_0.PNG
│              ...
└─ SIDD_Valid_Srgb_Patches_256
    └─ valid
        ├─ GT
        │      0_0.PNG
        │      ...
        └─ Noisy
               0_0.PNG
               ...
```
### 2.3 Training
Run the following command to start training:
```sh
python -u tools/main.py --config-file configs/invdn_denoising.yaml
```
- TIPS:
To keep the total number of $epochs$ the same as in the paper's configuration, make sure that $total\_batchsize*iters == 1gpus*14bs*600000iters$. When $batchsize$ changes, also keep $total\_batchsize/learning\_rate == 14/0.0002$.
For example, with 4 GPUs and a per-GPU $batchsize$ of 14, the actual total $batchsize$ is 14*4, so the total $iters$ should be set to 150,000 and the learning rate raised to 8e-4.
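The scaling rule above can be checked with a few lines of Python. This is a small sketch of the arithmetic only; the function name `scaled_schedule` and the constant names are hypothetical and do not exist in PaddleGAN.

```python
# Reference configuration stated in the tips above: 1 GPU, batch size 14,
# 600,000 iterations, learning rate 2e-4.
BASE_GPUS = 1
BASE_BATCH_PER_GPU = 14
BASE_ITERS = 600_000
BASE_LR = 2e-4


def scaled_schedule(num_gpus, batch_per_gpu=14):
    """Return (total_iters, learning_rate) that keep epochs and batch/lr ratio fixed."""
    total_batch = num_gpus * batch_per_gpu
    base_total = BASE_GPUS * BASE_BATCH_PER_GPU
    # Same number of samples seen overall: total_batch * iters stays constant.
    total_iters = BASE_ITERS * base_total // total_batch
    # Linear scaling: total_batch / learning_rate stays constant.
    learning_rate = BASE_LR * total_batch / base_total
    return total_iters, learning_rate


iters_4gpu, lr_4gpu = scaled_schedule(num_gpus=4)  # iters_4gpu is 150000
```

For 4 GPUs this gives the `total_iters: 150000` and `learning_rate: 8e-4` values used in `configs/invdn_denoising.yaml`.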
### 2.4 Test
Run the following command to start testing:
```sh
python tools/main.py --config-file configs/invdn_denoising.yaml --evaluate-only --load ${PATH_OF_WEIGHT}
```
## 3 Results
Denoising
| model | dataset | PSNR/SSIM |
|---|---|---|
| InvDN | SIDD | 39.29 / 0.956 |
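
For readers unfamiliar with the metric, PSNR can be computed as in the sketch below. This is a plain NumPy illustration of the standard definition, not the repository's metric class; in particular it omits the `crop_border` and `test_y_channel` options set in the config.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


clean = np.full((8, 8), 100.0)
noisy = clean + 5.0  # constant error of 5, so the MSE is 25
example_psnr = psnr(clean, noisy)
```

Higher values mean the restored image is closer to the ground truth.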
## 4 Download
| model | link |
|---|---|
| InvDN| [InvDN_Denoising](https://paddlegan.bj.bcebos.com/models/InvDN_Denoising.pdparams) |
# References
- [https://arxiv.org/abs/2104.10546](https://arxiv.org/abs/2104.10546)
```
@article{liu2021invertible,
title={Invertible Denoising Network: A Light Solution for Real Noise Removal},
author={Liu, Yang and Qin, Zhenyue and Anwar, Saeed and Ji, Pan and Kim, Dongwoo and Caldwell, Sabrina and Gedeon, Tom},
journal={arXiv preprint arXiv:2104.10546},
year={2021}
}
```
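[English](../../en_US/tutorials/invdn.md) | Chinese
# Invertible Denoising Network (InvDN): A Lightweight Solution for Real Noise Removal
**Invertible Denoising Network: A Light Solution for Real Noise Removal** (CVPR 2021) paper reproduction
Official code: [https://github.com/Yang-Liu1082/InvDN](https://github.com/Yang-Liu1082/InvDN)
Paper: [https://arxiv.org/abs/2104.10546](https://arxiv.org/abs/2104.10546)
## 1 Introduction
InvDN uses an invertible network to split a noisy image into a low-resolution clean image and a high-frequency latent representation that contains both noise and image content. Because an invertible network is lossless, separating the noise from the high-frequency representation would allow the clean image to be reconstructed at the original resolution together with the low-resolution clean image. In practice, removing the noise from the high-frequency information is difficult. The paper instead replaces the noisy high-frequency latent representation with one sampled from a prior distribution during the inverse pass, and then combines it with the low-resolution clean image to reconstruct the clean image at the original resolution. The resulting network is lightweight and performs well.
![invdn](https://user-images.githubusercontent.com/51016595/195344773-9ea17ef5-9edd-4310-bfff-36049bbcefde.png)
## 2 How to use
### 2.1 Quick start
After installing `PaddleGAN`, enter the `PaddleGAN` directory and run the following command to generate the restored image under `./output_dir/Denoising/` (saved as `image_name_restoration.png`, next to a copy of the noisy input `image_name.png`).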
```sh
python applications/tools/invdn_denoising.py --images_path ${PATH_OF_IMAGE}
```
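Here `PATH_OF_IMAGE` is the path of the image to denoise, or the path of the folder containing the images.
- Note that the author's original code uses Monte Carlo self-ensemble at test time to improve performance, at the cost of speed. You can pass `--disable_mc` to turn it off for faster inference. (Monte Carlo self-ensemble is enabled by default for $test$, and disabled by default for $train$ and $valid$.)
### 2.2 Prepare dataset
#### **Train Dataset**
This reproduction uses the SIDD dataset; the training set is [SIDD-Medium](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php). Following the paper, the dataset is cropped into patches of $512 \times 512$. Training also requires a low-resolution version of each GT image, of size $128 \times 128$, which is denoted LQ.
The processed dataset is available on [AI Studio](https://aistudio.baidu.com/aistudio/datasetdetail/172084).
The training data is placed under: `data/SIDD_Medium_Srgb_Patches_512/train/`.
#### **Test Dataset**
The validation set is [SIDD_valid](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php). The files downloaded from the official website are `./ValidationNoisyBlocksSrgb.mat` and `./ValidationGtBlocksSrgb.mat`. Converting them to $.png$ is recommended for convenience.
The converted dataset is available on [AI Studio](https://aistudio.baidu.com/aistudio/datasetdetail/172069).
The validation data is placed under: `data/SIDD_Valid_Srgb_Patches_256/valid/`.
- After processing, the file structure under the `PaddleGAN/data` folder is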
```sh
data
├─ SIDD_Medium_Srgb_Patches_512
│   └─ train
│       ├─ GT
│       │      0_0.PNG
│       │      ...
│       ├─ LQ
│       │      0_0.PNG
│       │      ...
│       └─ Noisy
│              0_0.PNG
│              ...
└─ SIDD_Valid_Srgb_Patches_256
    └─ valid
        ├─ GT
        │      0_0.PNG
        │      ...
        └─ Noisy
               0_0.PNG
               ...
```
### 2.3 Training
Run the following command to start training:
```sh
python -u tools/main.py --config-file configs/invdn_denoising.yaml
```
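- TIPS:
To keep the total number of $epochs$ the same as in the paper's configuration when reproducing, make sure that $total\_batchsize*iters == 1gpus*14bs*600000iters$. When $batchsize$ changes, also keep $total\_batchsize/learning\_rate == 14/0.0002$.
For example, on a single machine with 4 GPUs and a per-GPU $batchsize$ of 14, the actual total $batchsize$ is 14*4, so the total $iters$ should be set to 150000 and the learning rate raised to 8e-4.
### 2.4 Test
Run the following command to start testing: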
```sh
python tools/main.py --config-file configs/invdn_denoising.yaml --evaluate-only --load ${PATH_OF_WEIGHT}
```
## 3 Results
Denoising
| model | dataset | PSNR/SSIM |
|---|---|---|
| InvDN | SIDD | 39.29 / 0.956 |
## 4 Download
| model | link |
|---|---|
| InvDN| [InvDN_Denoising](https://paddlegan.bj.bcebos.com/models/InvDN_Denoising.pdparams) |
# References
- [https://arxiv.org/abs/2104.10546](https://arxiv.org/abs/2104.10546)
```
@article{liu2021invertible,
title={Invertible Denoising Network: A Light Solution for Real Noise Removal},
author={Liu, Yang and Qin, Zhenyue and Anwar, Saeed and Ji, Pan and Kim, Dongwoo and Caldwell, Sabrina and Gedeon, Tom},
journal={arXiv preprint arXiv:2104.10546},
year={2021}
}
```
......@@ -38,3 +38,4 @@ from .recurrent_vsr_predictor import (PPMSVSRPredictor, BasicVSRPredictor, \
from .singan_predictor import SinGANPredictor
from .gpen_predictor import GPENPredictor
from .swinir_predictor import SwinIRPredictor
from .invdn_predictor import InvDNPredictor
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
from glob import glob
from natsort import natsorted
import numpy as np
import os
import random
from tqdm import tqdm

import paddle

from ppgan.models.generators import InvDN
from ppgan.utils.download import get_path_from_url
from .base_predictor import BasePredictor

model_cfgs = {
    'Denoising': {
        'model_urls':
        'https://paddlegan.bj.bcebos.com/models/InvDN_Denoising.pdparams',
        'channel_in': 3,
        'channel_out': 3,
        'block_num': [8, 8],
        'scale': 4,
        'down_num': 2
    }
}


class InvDNPredictor(BasePredictor):

    def __init__(self, output_path='output_dir', weight_path=None, seed=None):
        self.output_path = output_path
        task = 'Denoising'
        self.task = task

        if weight_path is None:
            if task in model_cfgs.keys():
                weight_path = get_path_from_url(model_cfgs[task]['model_urls'])
                checkpoint = paddle.load(weight_path)
            else:
                raise ValueError('Predictor need a task to define!')
        else:
            if weight_path.startswith("http"):  # os.path.islink doesn't work!
                weight_path = get_path_from_url(weight_path)
                checkpoint = paddle.load(weight_path)
            else:
                checkpoint = paddle.load(weight_path)

        self.generator = InvDN(channel_in=model_cfgs[task]['channel_in'],
                               channel_out=model_cfgs[task]['channel_out'],
                               block_num=model_cfgs[task]['block_num'],
                               scale=model_cfgs[task]['scale'],
                               down_num=model_cfgs[task]['down_num'])

        checkpoint = checkpoint['generator']
        self.generator.set_state_dict(checkpoint)
        self.generator.eval()

        if seed is not None:
            paddle.seed(seed)
            random.seed(seed)
            np.random.seed(seed)

    def get_images(self, images_path):
        if os.path.isdir(images_path):
            return natsorted(
                glob(os.path.join(images_path, '*.jpeg')) +
                glob(os.path.join(images_path, '*.jpg')) +
                glob(os.path.join(images_path, '*.JPG')) +
                glob(os.path.join(images_path, '*.png')) +
                glob(os.path.join(images_path, '*.PNG')))
        else:
            return [images_path]

    def imread_uint(self, path, n_channels=3):
        # input: path
        # output: HxWx3(RGB or GGG), or HxWx1 (G)
        if n_channels == 1:
            img = cv2.imread(path, 0)  # cv2.IMREAD_GRAYSCALE
            img = np.expand_dims(img, axis=2)  # HxWx1
        elif n_channels == 3:
            img = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # BGR or G
            if img.ndim == 2:
                img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)  # GGG
            else:
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # RGB

        return img

    def uint2single(self, img):
        return np.float32(img / 255.)

    # convert single (HxWxC) to 3-dimensional paddle tensor
    def single2tensor3(self, img):
        return paddle.Tensor(np.ascontiguousarray(
            img, dtype=np.float32)).transpose([2, 0, 1])

    def forward_x8(self, x, forward_function, noise_channel):

        def _transform(v, op):
            v2np = v.cpu().numpy()
            if op == 'v':
                tfnp = v2np[:, :, :, ::-1].copy()
            elif op == 'h':
                tfnp = v2np[:, :, ::-1, :].copy()
            elif op == 't':
                tfnp = v2np.transpose((0, 1, 3, 2)).copy()

            ret = paddle.to_tensor(tfnp)
            return ret

        noise_list = [x]
        for tf in 'v', 'h', 't':
            noise_list.extend([_transform(t, tf) for t in noise_list])

        gaussian_list = [
            paddle.randn(
                (aug.shape[0], noise_channel, aug.shape[2], aug.shape[3]))
            for aug in noise_list
        ]
        sr_list = [
            forward_function(aug, g_noise)[0]
            for aug, g_noise in zip(noise_list, gaussian_list)
        ]

        for i in range(len(sr_list)):
            if i > 3:
                sr_list[i] = _transform(sr_list[i], 't')
            if i % 4 > 1:
                sr_list[i] = _transform(sr_list[i], 'h')
            if (i % 4) % 2 == 1:
                sr_list[i] = _transform(sr_list[i], 'v')

        output_cat = paddle.stack(sr_list, axis=0)
        output = output_cat.mean(axis=0)

        return output

    def run(self, images_path=None, disable_mc=False):
        os.makedirs(self.output_path, exist_ok=True)
        task_path = os.path.join(self.output_path, self.task)
        os.makedirs(task_path, exist_ok=True)
        image_files = self.get_images(images_path)
        for image_file in tqdm(image_files):
            img_noisy = self.imread_uint(image_file, 3)

            image_name = os.path.basename(image_file)
            img = cv2.cvtColor(img_noisy, cv2.COLOR_RGB2BGR)
            cv2.imwrite(os.path.join(task_path, image_name), img)

            tmps = image_name.split('.')
            assert len(
                tmps) == 2, f'Invalid image name: {image_name}, too much "."'
            restoration_save_path = os.path.join(
                task_path, f'{tmps[0]}_restoration.{tmps[1]}')

            img_noisy = self.uint2single(img_noisy)

            # HWC to CHW, numpy to tensor
            img_noisy = self.single2tensor3(img_noisy)
            img_noisy = img_noisy.unsqueeze(0)
            with paddle.no_grad():

                # Monte Carlo Self Ensemble
                noise_channel = 3 * 4**(model_cfgs['Denoising']['down_num']) - 3
                if not disable_mc:
                    output = self.forward_x8(img_noisy, self.generator.forward,
                                             noise_channel)
                    output = output[:, :3, :, :]
                else:
                    noise = paddle.randn(
                        (img_noisy.shape[0], noise_channel, img_noisy.shape[2],
                         img_noisy.shape[3]))
                    output, _ = self.generator(img_noisy, noise)
                    output = output[:, :3, :, :]

            restored = paddle.clip(output, 0, 1)

            restored = restored.numpy()
            restored = restored.transpose(0, 2, 3, 1)
            restored = restored[0]
            restored = restored * 255
            restored = restored.astype(np.uint8)

            cv2.imwrite(restoration_save_path,
                        cv2.cvtColor(restored, cv2.COLOR_RGB2BGR))

        print('Done, output path is:', task_path)
......@@ -32,3 +32,4 @@ from .photopen_dataset import PhotoPenDataset
from .empty_dataset import EmptyDataset
from .gpen_dataset import GPENDataset
from .swinir_dataset import SwinIRDataset
from .invdn_dataset import InvDNDataset
# code was heavily based on https://github.com/cszn/KAIR
# MIT License
# Copyright (c) 2019 Kai Zhang
import os
import os.path as osp
import pickle
import random
import numpy as np
import cv2
import math

import paddle
from paddle.io import Dataset

from .builder import DATASETS

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG', '.png', '.PNG', '.ppm', '.PPM', '.bmp',
    '.BMP'
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def _get_paths_from_images(path):
    '''get image path list from image folder'''
    assert os.path.isdir(path), '{:s} is not a valid directory'.format(path)
    images = []
    for dirpath, _, fnames in sorted(os.walk(path)):
        for fname in sorted(fnames):
            if is_image_file(fname):
                img_path = os.path.join(dirpath, fname)
                images.append(img_path)
    assert images, '{:s} has no valid image file'.format(path)
    return images


def get_image_paths(data_type, dataroot):
    '''get image path list'''
    paths, sizes = None, None
    if dataroot is not None:
        if data_type == 'img':
            paths = sorted(_get_paths_from_images(dataroot))
        else:
            raise NotImplementedError(
                'data_type [{:s}] is not recognized.'.format(data_type))
    return paths, sizes


def read_img(env, path, size=None):
    '''read image by cv2
    return: Numpy float32, HWC, BGR, [0,1]'''
    if env is None:  # img
        #img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
        img = cv2.imread(path, cv2.IMREAD_COLOR)
    img = img.astype(np.float32) / 255.
    if img.ndim == 2:
        img = np.expand_dims(img, axis=2)
    # some images have 4 channels
    if img.shape[2] > 3:
        img = img[:, :, :3]
    return img


def modcrop(img_in, scale):
    # img_in: Numpy, HWC or HW
    img = np.copy(img_in)
    if img.ndim == 2:
        H, W = img.shape
        H_r, W_r = H % scale, W % scale
        img = img[:H - H_r, :W - W_r]
    elif img.ndim == 3:
        H, W, C = img.shape
        H_r, W_r = H % scale, W % scale
        img = img[:H - H_r, :W - W_r, :]
    else:
        raise ValueError('Wrong img ndim: [{:d}].'.format(img.ndim))
    return img


def bgr2ycbcr(img, only_y=True):
    '''bgr version of rgb2ycbcr
    only_y: only return Y channel
    Input:
        uint8, [0, 255]
        float, [0, 1]
    '''
    in_img_type = img.dtype
    # astype returns a new array; keep the result so the caller's data is
    # not scaled in place below.
    img = img.astype(np.float32)
    if in_img_type != np.uint8:
        img *= 255.
    # convert
    if only_y:
        rlt = np.dot(img, [24.966, 128.553, 65.481]) / 255.0 + 16.0
    else:
        rlt = np.matmul(img,
                        [[24.966, 112.0, -18.214], [128.553, -74.203, -93.786],
                         [65.481, -37.797, 112.0]]) / 255.0 + [16, 128, 128]
    if in_img_type == np.uint8:
        rlt = rlt.round()
    else:
        rlt /= 255.
    return rlt.astype(in_img_type)


def channel_convert(in_c, tar_type, img_list):
    # conversion among BGR, gray and y
    if in_c == 3 and tar_type == 'gray':  # BGR to gray
        gray_list = [cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) for img in img_list]
        return [np.expand_dims(img, axis=2) for img in gray_list]
    elif in_c == 3 and tar_type == 'y':  # BGR to y
        y_list = [bgr2ycbcr(img, only_y=False) for img in img_list]
        return y_list
        # return [np.expand_dims(img, axis=2) for img in y_list]
    elif in_c == 1 and tar_type == 'RGB':  # gray/y to BGR
        return [cv2.cvtColor(img, cv2.COLOR_GRAY2BGR) for img in img_list]
    else:
        return img_list


def augment(img_list, hflip=True, rot=True):
    # horizontal flip OR rotate
    hflip = hflip and random.random() < 0.5
    vflip = rot and random.random() < 0.5
    rot90 = rot and random.random() < 0.5

    def _augment(img):
        if isinstance(img, list):
            if hflip:
                img = [image[:, ::-1, :] for image in img]
            if vflip:
                img = [image[::-1, :, :] for image in img]
            if rot90:
                img = [image.transpose(1, 0, 2) for image in img]
        else:
            if hflip:
                img = img[:, ::-1, :]
            if vflip:
                img = img[::-1, :, :]
            if rot90:
                img = img.transpose(1, 0, 2)
        return img

    return [_augment(img) for img in img_list]


@DATASETS.register()
class InvDNDataset(Dataset):
    '''
    Read LQ (Low Quality, here is LR), GT and noisy image pairs.
    The pair is ensured by 'sorted' function, so please check the name convention.
    '''

    def __init__(self, opt=None):
        super(InvDNDataset, self).__init__()
        self.opt = opt
        self.is_train = True if self.opt['phase'] == 'train' else False

        self.paths_LQ, self.paths_GT, self.paths_Noisy = None, None, None
        self.sizes_LQ, self.sizes_GT, self.sizes_Noisy = None, None, None
        self.LQ_env, self.GT_env, self.Noisy_env = None, None, None

        self.data_type = "img"

        if self.is_train:
            dataroot_gt = osp.join(opt["train_dir"], "GT")
            dataroot_noisy = osp.join(opt["train_dir"], "Noisy")
            dataroot_lq = osp.join(opt["train_dir"], "LQ")
        else:
            dataroot_gt = osp.join(opt["val_dir"], "GT")
            dataroot_noisy = osp.join(opt["val_dir"], "Noisy")
            dataroot_lq = None

        self.paths_GT, self.sizes_GT = get_image_paths(self.data_type,
                                                       dataroot_gt)
        self.paths_Noisy, self.sizes_Noisy = get_image_paths(
            self.data_type, dataroot_noisy)
        self.paths_LQ, self.sizes_LQ = get_image_paths(self.data_type,
                                                       dataroot_lq)

        assert self.paths_GT, 'Error: GT path is empty.'
        assert self.paths_Noisy, 'Error: Noisy path is empty.'
        if self.paths_LQ and self.paths_GT:
            assert len(self.paths_LQ) == len(
                self.paths_GT
            ), 'GT and LQ datasets have different number of images - {}, {}.'.format(
                len(self.paths_LQ), len(self.paths_GT))
        self.random_scale_list = [1]

    def __getitem__(self, index):
        GT_path, Noisy_path, LQ_path = None, None, None
        scale = self.opt["scale"]

        # get GT image
        GT_path = self.paths_GT[index]
        resolution = None
        img_GT = read_img(self.GT_env, GT_path, resolution)

        # modcrop in the validation / test phase
        if not self.is_train:
            img_GT = modcrop(img_GT, scale)

        # change color space if necessary
        img_GT = channel_convert(img_GT.shape[2], "RGB", [img_GT])[0]

        # get Noisy image
        Noisy_path = self.paths_Noisy[index]
        resolution = None
        img_Noisy = read_img(self.Noisy_env, Noisy_path, resolution)

        # modcrop in the validation / test phase
        if not self.is_train:
            img_Noisy = modcrop(img_Noisy, scale)

        # change color space if necessary
        img_Noisy = channel_convert(img_Noisy.shape[2], "RGB", [img_Noisy])[0]

        # get LQ image
        if self.paths_LQ:
            LQ_path = self.paths_LQ[index]
            resolution = None
            img_LQ = read_img(self.LQ_env, LQ_path, resolution)

        if self.is_train:
            GT_size = self.opt["crop_size"]

            H, W, C = img_LQ.shape
            LQ_size = GT_size // scale

            # randomly crop
            rnd_h = random.randint(0, max(0, H - LQ_size))
            rnd_w = random.randint(0, max(0, W - LQ_size))
            img_LQ = img_LQ[rnd_h:rnd_h + LQ_size, rnd_w:rnd_w +
                            LQ_size, :]  # (128, 128, 3) --> (36, 36, 3)
            rnd_h_GT, rnd_w_GT = int(rnd_h * scale), int(rnd_w * scale)
            img_GT = img_GT[rnd_h_GT:rnd_h_GT + GT_size, rnd_w_GT:rnd_w_GT +
                            GT_size, :]  # (512, 512, 3) --> (144, 144, 3)
            img_Noisy = img_Noisy[rnd_h_GT:rnd_h_GT + GT_size,
                                  rnd_w_GT:rnd_w_GT + GT_size, :]

            # augmentation - flip, rotate
            img_LQ, img_GT, img_Noisy = augment([img_LQ, img_GT, img_Noisy],
                                                True, True)

            # change color space if necessary
            # images are HWC here, so the channel count is shape[2]
            C = img_LQ.shape[2]
            img_LQ = channel_convert(C, "RGB", [img_LQ])[0]

        # BGR to RGB, HWC to CHW, numpy to tensor
        if img_GT.shape[2] == 3:
            img_GT = img_GT[:, :, [2, 1, 0]]
            img_Noisy = img_Noisy[:, :, [2, 1, 0]]
            if self.is_train:
                img_LQ = img_LQ[:, :, [2, 1, 0]]

        img_GT = paddle.to_tensor(np.ascontiguousarray(
            np.transpose(img_GT, (2, 0, 1))),
                                  dtype="float32")
        img_Noisy = paddle.to_tensor(np.ascontiguousarray(
            np.transpose(img_Noisy, (2, 0, 1))),
                                     dtype="float32")
        if self.is_train:
            img_LQ = paddle.to_tensor(np.ascontiguousarray(
                np.transpose(img_LQ, (2, 0, 1))),
                                      dtype="float32")

        if self.is_train:
            return img_Noisy, img_GT, img_LQ
        return img_Noisy, img_GT, img_GT

    def __len__(self):
        return len(self.paths_GT)  # 32000 for train, 1280 for valid
......@@ -39,3 +39,4 @@ from .rcan_model import RCANModel
from .prenet_model import PReNetModel
from .gpen_model import GPENModel
from .swinir_model import SwinIRModel
from .invdn_model import InvDNModel
......@@ -43,3 +43,4 @@ from .rcan import RCAN
from .prenet import PReNet
from .gpen import GPEN
from .swinir import SwinIR
from .invdn import InvDN
# code was heavily based on https://github.com/Yang-Liu1082/InvDN
from itertools import repeat
import collections.abc
import math
import numpy as np

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

from .builder import GENERATORS


class ResBlock(nn.Layer):

    def __init__(self, channel_in, channel_out):
        super(ResBlock, self).__init__()
        feature = 64
        weight_attr, bias_attr = self._init_weights()
        self.conv1 = nn.Conv2D(channel_in,
                               feature,
                               kernel_size=3,
                               padding=1,
                               weight_attr=weight_attr,
                               bias_attr=bias_attr)
        self.relu1 = nn.LeakyReLU(negative_slope=0.2)
        self.conv2 = nn.Conv2D(feature,
                               feature,
                               kernel_size=3,
                               padding=1,
                               weight_attr=weight_attr,
                               bias_attr=bias_attr)
        self.conv3 = nn.Conv2D((feature + channel_in),
                               channel_out,
                               kernel_size=3,
                               padding=1,
                               weight_attr=weight_attr,
                               bias_attr=bias_attr)

    def forward(self, x):
        residual = self.relu1(self.conv1(x))
        residual = self.relu1(self.conv2(residual))
        input = paddle.concat((x, residual), 1)
        out = self.conv3(input)
        return out

    def _init_weights(self):
        weight_attr = paddle.ParamAttr(
            initializer=paddle.nn.initializer.KaimingUniform(
                negative_slope=math.sqrt(5), nonlinearity='leaky_relu'))
        bias_attr = paddle.ParamAttr(
            initializer=paddle.nn.initializer.KaimingUniform(
                negative_slope=math.sqrt(5), nonlinearity='leaky_relu'))
        return weight_attr, bias_attr


class InvBlockExp(nn.Layer):

    def __init__(self,
                 subnet_constructor,
                 channel_num,
                 channel_split_num,
                 clamp=1.):
        super(InvBlockExp, self).__init__()

        self.split_len1 = channel_split_num  # 3
        self.split_len2 = channel_num - channel_split_num  # 12-3

        self.clamp = clamp

        self.F = subnet_constructor(self.split_len2, self.split_len1)  # 9->3
        self.G = subnet_constructor(self.split_len1, self.split_len2)  # 3->9
        self.H = subnet_constructor(self.split_len1, self.split_len2)  # 3->9

    def forward(self, x, rev=False):
        x1 = paddle.slice(x, [1], [0], [self.split_len1])  # low resolution img
        x2 = paddle.slice(x, [1], [self.split_len1],
                          [self.split_len1 + self.split_len2])  # high frequency

        if not rev:
            y1 = x1 + self.F(x2)
            self.s = self.clamp * (F.sigmoid(self.H(y1)) * 2 - 1)
            y2 = x2.multiply(paddle.exp(self.s)) + self.G(y1)
        else:
            self.s = self.clamp * (F.sigmoid(self.H(x1)) * 2 - 1)
            y2 = (x2 - self.G(x1)).divide(paddle.exp(self.s))
            y1 = x1 - self.F(y2)

        return paddle.concat((y1, y2), 1)


class HaarDownsampling(nn.Layer):

    def __init__(self, channel_in):
        super(HaarDownsampling, self).__init__()
        self.channel_in = channel_in

        self.haar_weights = paddle.ones([4, 1, 2, 2])

        self.haar_weights[1, 0, 0, 1] = -1
        self.haar_weights[1, 0, 1, 1] = -1

        self.haar_weights[2, 0, 1, 0] = -1
        self.haar_weights[2, 0, 1, 1] = -1

        self.haar_weights[3, 0, 1, 0] = -1
        self.haar_weights[3, 0, 0, 1] = -1

        self.haar_weights = paddle.concat([self.haar_weights] * self.channel_in,
                                          0)
        self.haar_weights = paddle.create_parameter(
            shape=self.haar_weights.shape,
            dtype=str(self.haar_weights.numpy().dtype),
            default_initializer=paddle.nn.initializer.Assign(self.haar_weights))
        self.haar_weights.stop_gradient = True

    def forward(self, x, rev=False):
        if not rev:
            self.elements = x.shape[1] * x.shape[2] * x.shape[3]

            out = F.conv2d(x,
                           self.haar_weights,
                           bias=None,
                           stride=2,
                           groups=self.channel_in) / 4.0
            out = out.reshape([
                x.shape[0], self.channel_in, 4, x.shape[2] // 2, x.shape[3] // 2
            ])
            out = paddle.transpose(out, [0, 2, 1, 3, 4])
            out = out.reshape([
                x.shape[0], self.channel_in * 4, x.shape[2] // 2,
                x.shape[3] // 2
            ])
            return out
        else:
            self.elements = x.shape[1] * x.shape[2] * x.shape[3]

            out = x.reshape(
                [x.shape[0], 4, self.channel_in, x.shape[2], x.shape[3]])
            out = paddle.transpose(out, [0, 2, 1, 3, 4])
            out = out.reshape(
                [x.shape[0], self.channel_in * 4, x.shape[2], x.shape[3]])
            return F.conv2d_transpose(out,
                                      self.haar_weights,
                                      bias=None,
                                      stride=2,
                                      groups=self.channel_in)


@GENERATORS.register()
class InvDN(nn.Layer):

    def __init__(self,
                 channel_in=3,
                 channel_out=3,
                 block_num=[8, 8],
                 scale=4,
                 down_num=2):
        super(InvDN, self).__init__()

        operations = []

        current_channel = channel_in

        subnet_constructor = constructor

        self.down_num = int(math.log(scale, 2))
        assert self.down_num == down_num

        for i in range(self.down_num):
            b = HaarDownsampling(current_channel)
            operations.append(b)
            current_channel *= 4
            for j in range(block_num[i]):
                b = InvBlockExp(subnet_constructor, current_channel,
                                channel_out)
                operations.append(b)

        self.operations = nn.LayerList(operations)

    def forward(self, x, noise):
        # forward
        out = x
        for op in self.operations:
            out = op.forward(out, False)
        lq = out

        # backward
        _, _, H, W = lq.shape
        noise = noise[:, :, :H, :W]
        out = paddle.concat((out[:, :3, :, :], noise), axis=1)
        for op in reversed(self.operations):
            out = op.forward(out, True)

        return out, lq


def constructor(channel_in, channel_out):
    return ResBlock(channel_in, channel_out)
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import paddle
import paddle.nn as nn
from .builder import MODELS
from .base_model import BaseModel
from .generators.builder import build_generator
from .criterions.builder import build_criterion
from ppgan.utils.visual import tensor2img
@MODELS.register()
class InvDNModel(BaseModel):
"""InvDN Model.
Invertible Denoising Network: A Light Solution for Real Noise Removal (CVPR 2021)
Originally Written by Liu, Yang and Qin, Zhenyue.
"""
def __init__(self, generator):
"""Initialize the the class.
Args:
generator (dict): config of generator.
"""
super(InvDNModel, self).__init__(generator)
self.current_iter = 1
self.nets['generator'] = build_generator(generator)
self.generator_cfg = generator
def setup_input(self, input):
self.noisy = input[0]
self.gt = input[1]
self.lq = input[2]
    def train_iter(self, optims=None):
        optims['optim'].clear_gradients()

        # Noise fills every generator channel except the 3 image channels.
        noise_channel = 3 * 4**(self.generator_cfg.down_num) - 3
        noise = paddle.randn((self.noisy.shape[0], noise_channel,
                              self.noisy.shape[2], self.noisy.shape[3]))
        output_hq, output_lq = self.nets['generator'](self.noisy, noise)
        output_hq = output_hq[:, :3, :, :]
        output_lq = output_lq[:, :3, :, :]

        self.lq = self.lq.detach()
        # Forward-fit loss: per-sample sum of squared errors against lq, weighted by 16.
        l_forw_fit = 16.0 * paddle.mean(
            paddle.sum((output_lq - self.lq)**2, (1, 2, 3)))
        # Backward reconstruction loss: per-sample sum of sqrt(diff^2 + 1e-3) against gt.
        l_back_rec = paddle.mean(
            paddle.sum(
                paddle.sqrt((self.gt - output_hq) * (self.gt - output_hq) +
                            1e-3), (1, 2, 3)))

        l_total = l_forw_fit + l_back_rec

        l_total.backward()
        optims['optim'].step()
        self.losses['loss'] = l_total.numpy()

    def forward(self):
        pass
    def test_iter(self, metrics=None):
        self.nets['generator'].eval()
        with paddle.no_grad():
            noise_channel = 3 * 4**(self.generator_cfg.down_num) - 3
            noise = paddle.randn((self.noisy.shape[0], noise_channel,
                                  self.noisy.shape[2], self.noisy.shape[3]))
            output_hq, _ = self.nets['generator'](self.noisy, noise)
            output_hq = output_hq[:, :3, :, :]

            self.output = output_hq
            self.visual_items['output'] = self.output
        self.nets['generator'].train()

        out_img = []
        gt_img = []
        for out_tensor, gt_tensor in zip(self.output, self.gt):
            out_img.append(tensor2img(out_tensor, (0., 1.)))
            gt_img.append(tensor2img(gt_tensor, (0., 1.)))

        if metrics is not None:
            for metric in metrics.values():
                metric.update(out_img, gt_img)
    def export_model(self,
                     export_model=None,
                     output_dir=None,
                     inputs_size=None,
                     export_serving_model=False,
                     model_name=None):
        shape = inputs_size[0]
        new_model = self.nets['generator']
        new_model.eval()
        noise_channel = 3 * 4**(self.generator_cfg.down_num) - 3
        noise_shape = (shape[0], noise_channel, shape[2], shape[3])
        input_spec = [
            paddle.static.InputSpec(shape=shape, dtype="float32"),
            paddle.static.InputSpec(shape=noise_shape, dtype="float32")
        ]

        static_model = paddle.jit.to_static(new_model, input_spec=input_spec)

        if output_dir is None:
            output_dir = 'inference_model'
        if model_name is None:
            model_name = '{}_{}'.format(self.__class__.__name__.lower(),
                                        export_model[0]['name'])

        paddle.jit.save(static_model, os.path.join(output_dir, model_name))
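The two loss terms in `train_iter` can be checked by hand on small arrays. The sketch below restates them in NumPy rather than Paddle, so it is a stand-in for illustration only. The function names `forward_fit_loss` and `backward_rec_loss` are hypothetical; the constants 16.0 and 1e-3 are the ones used in `train_iter`.

```python
import numpy as np


def forward_fit_loss(output_lq, lq, weight=16.0):
    """Batch mean of the per-sample sum of squared errors, scaled by weight."""
    per_sample = np.sum((output_lq - lq) ** 2, axis=(1, 2, 3))
    return weight * np.mean(per_sample)


def backward_rec_loss(output_hq, gt, eps=1e-3):
    """Batch mean of the per-sample sum of sqrt(diff^2 + eps)."""
    diff = gt - output_hq
    per_sample = np.sum(np.sqrt(diff * diff + eps), axis=(1, 2, 3))
    return np.mean(per_sample)


def total_loss(output_hq, output_lq, gt, lq):
    """Sum of the two terms, matching l_total = l_forw_fit + l_back_rec."""
    return forward_fit_loss(output_lq, lq) + backward_rec_loss(output_hq, gt)
```

Both terms sum over channels and pixels before averaging over the batch, so their magnitude grows with the patch size. The reconstruction term also never reaches zero: a perfect prediction still contributes `sqrt(1e-3)` per element.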
===========================train_params===========================
model_name:invdn
python:python3.7
gpu_list:0
##
auto_cast:null
total_iters:lite_train_lite_infer=10
output_dir:./output/
snapshot_config.interval:lite_train_lite_infer=10
pretrained_model:null
train_model_name:invdn*/*checkpoint.pdparams
train_infer_img_dir:null
null:null
##
trainer:norm_train
norm_train:tools/main.py -c configs/invdn_denoising.yaml --seed 100 -o log_config.interval=1
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:null
null:null
##
===========================infer_params===========================
--output_dir:./output/
load:null
norm_export:tools/export_model.py -c configs/invdn_denoising.yaml --inputs_size=1,3,256,256 --model_name inference --load
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
inference_dir:inference
train_model:./inference/invdn/invdnmodel_generator
infer_export:null
infer_quant:False
inference:tools/inference.py --model_type invdn --seed 100 -c configs/invdn_denoising.yaml --output_path test_tipc/output/
--device:gpu
null:null
null:null
null:null
null:null
null:null
--model_path:
null:null
null:null
--benchmark:True
null:null
......@@ -66,6 +66,10 @@ if [ ${MODE} = "lite_train_lite_infer" ];then
rm -rf ./data/*sets
wget -nc -P ./data/ https://paddlegan.bj.bcebos.com/datasets/swinir_data.zip --no-check-certificate
cd ./data/ && unzip -q swinir_data.zip && cd ../ ;;
invdn)
rm -rf ./data/SIDD_*
wget -nc -P ./data/ https://paddlegan.bj.bcebos.com/datasets/SIDD_mini.zip --no-check-certificate
cd ./data/ && unzip -q SIDD_mini.zip && cd ../ ;;
singan)
rm -rf ./data/singan*
wget -nc -P ./data/ https://paddlegan.bj.bcebos.com/datasets/singan-official_images.zip --no-check-certificate
......
......@@ -19,7 +19,7 @@ from ppgan.metrics import build_metric
MODEL_CLASSES = ["pix2pix", "cyclegan", "wav2lip", "esrgan", \
"edvr", "fom", "stylegan2", "basicvsr", "msvsr", "singan", "swinir"]
"edvr", "fom", "stylegan2", "basicvsr", "msvsr", "singan", "swinir", "invdn"]
def parse_args():
......@@ -391,6 +391,29 @@ def main():
            file_name = os.path.join(args.output_path, model_type,
                                     "{}.png".format(i))
            cv2.imwrite(file_name, sample)
        elif model_type == "invdn":
            noisy = data[0].numpy()
            noise_channel = 3 * 4**(cfg.model.generator.down_num) - 3
            input_handles[0].copy_from_cpu(noisy)
            input_handles[1].copy_from_cpu(
                np.random.randn(noisy.shape[0], noise_channel, noisy.shape[2],
                                noisy.shape[3]).astype(np.float32))
            predictor.run()
            output_handles = [
                predictor.get_output_handle(name)
                for name in predictor.get_output_names()
            ]
            prediction = output_handles[0].copy_to_cpu()
            prediction = paddle.to_tensor(prediction[0])
            image_numpy = tensor2img(prediction, min_max)
            gt_numpy = tensor2img(data[1], min_max)
            save_image(image_numpy,
                       os.path.join(args.output_path, "invdn/{}.png".format(i)))
            metric_file = os.path.join(args.output_path, model_type,
                                       "metric.txt")
            for metric in metrics.values():
                metric.update(image_numpy, gt_numpy)
            break

    if metrics:
        log_file = open(metric_file, 'a')
......
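The exported InvDN predictor takes two inputs: the noisy image and a float32 noise tensor with `3 * 4**down_num - 3` channels. A small NumPy sketch of how that second input is sized is shown below. The helpers `noise_channels` and `make_noise` are hypothetical, and the `image_channels` parameter is an added assumption, since the code in this change hard-codes 3.

```python
import numpy as np


def noise_channels(down_num, image_channels=3):
    """Channel count of the noise input: image_channels * 4**down_num - image_channels."""
    return image_channels * 4 ** down_num - image_channels


def make_noise(noisy, down_num, seed=None):
    """Draw standard-normal float32 noise sized to match a noisy NCHW batch."""
    rng = np.random.default_rng(seed)
    n, c, h, w = noisy.shape
    shape = (n, noise_channels(down_num, c), h, w)
    return rng.standard_normal(shape).astype(np.float32)


if __name__ == "__main__":
    batch = np.zeros((1, 3, 256, 256), dtype=np.float32)
    # The value down_num=2 is chosen only for illustration.
    print(make_noise(batch, down_num=2).shape)
```

For example, `down_num = 2` would give 45 noise channels. The noise is drawn at the full input resolution here, as in the inference branch above; the generator's `forward` crops it to the spatial size it needs.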