...
 
Commits (18)

- 6ca8404 Fix bug #28 (hypox64 <hypox128@gmail.com>, 2020-09-11)
  https://gitcode.net/hypo/DeepMosaics/-/commit/6ca84044588f7bfc1d49a3c1a51d34fcb1b4f990
- cab98f7 1. Allow unfinished tasks to be restored (#38 #33) 2. Optimize image_pool when processing video (hypox64 <hypox128@gmail.com>, 2020-12-17)
  https://gitcode.net/hypo/DeepMosaics/-/commit/cab98f71e5ab7828d3450eeb73c5cf9318deecdd
- e31088f Clean cache during processing (hypox64 <hypox128@gmail.com>, 2020-12-19)
  https://gitcode.net/hypo/DeepMosaics/-/commit/e31088f5977054f41ff73b1f474bd04c6c3a9eed
- 87f33a5 English grammar and header capitalization consistency fixes (john <18413674+grravity@users.noreply.github.com>, 2021-03-19)
  https://gitcode.net/hypo/DeepMosaics/-/commit/87f33a504e91e266cc3924dc218c163a4c8bd3a3
- 8ad64b5 Merge pull request #48 from grravity/master (Hypo <hypox128@gmail.com>, 2021-04-13)
  https://gitcode.net/hypo/DeepMosaics/-/commit/8ad64b5112dbd308a5c454f46288d2b361d2fb45
- 538ccbc Check the environment before running (hypox64 <hypox128@gmail.com>, 2021-04-18)
  https://gitcode.net/hypo/DeepMosaics/-/commit/538ccbcdab757e061f8643ea7acb9fddf974e7dc
- 972dcc8 BVDNet (hypox64 <hypox128@gmail.com>, 2021-04-18)
  https://gitcode.net/hypo/DeepMosaics/-/commit/972dcc875563f94628478a09b60396d4adc1ad76
- 8f4e915 Fix frame leak (hypox64 <hypox128@gmail.com>, 2021-04-18)
  https://gitcode.net/hypo/DeepMosaics/-/commit/8f4e9158d1b0ded134607e4f3f6ec68330788928
- 2bbda35 BVDNet SpectralNorm (hypox64 <hypox128@gmail.com>, 2021-04-19)
  https://gitcode.net/hypo/DeepMosaics/-/commit/2bbda3510a4e315d96d6d2e1c83944f470be6b27
- 4c6b29b Optimize the ffmpeg command, modify use_gpu to gpu_id (hypox64 <hypox128@gmail.com>, 2021-04-20)
  https://gitcode.net/hypo/DeepMosaics/-/commit/4c6b29b42ff8f1609a4cd1f93f0b056f6b26fc63
- 48c032b Readly to add gan (hypox64 <hypox128@gmail.com>, 2021-04-22)
  https://gitcode.net/hypo/DeepMosaics/-/commit/48c032b16e0804a67cae6eadcc822b0fdb39f3ae
- 796b59d sngan (hypox64 <hypox128@gmail.com>, 2021-04-22)
  https://gitcode.net/hypo/DeepMosaics/-/commit/796b59d036057fe8b45a92635ce04b22da52a7c6
- 1749be9 Gan code finished! (hypox64 <hypox128@gmail.com>, 2021-04-24)
  https://gitcode.net/hypo/DeepMosaics/-/commit/1749be92f3f360e328d7b2945d56b65d3a7f943a
- 65f4892 Completed the video core and faster feather (hypox64 <hypox128@gmail.com>, 2021-04-25)
  https://gitcode.net/hypo/DeepMosaics/-/commit/65f48925ba0fc5cdcfb83172f2ef818b82083927
- e100118 Updata README.md and fix some bugs (hypox64 <hypox128@gmail.com>, 2021-05-10)
  https://gitcode.net/hypo/DeepMosaics/-/commit/e100118e1db2386109421b06a3503914e6764114
- e52e6c8 Merge branch 'dev' (hypox64 <hypox128@gmail.com>, 2021-05-10)
  https://gitcode.net/hypo/DeepMosaics/-/commit/e52e6c8f9d2b0f26e736a5e11c993225ee46954e
- ef9b2e4 V0.5.1 (#19 #33 #53) (hypox64 <hypox128@gmail.com>, 2021-05-23)
  https://gitcode.net/hypo/DeepMosaics/-/commit/ef9b2e4ce5913e895bd1e67759427c9a02e985e7
- 7b9afd4 Merge branch 'dev' (hypox64 <hypox128@gmail.com>, 2021-05-23)
  https://gitcode.net/hypo/DeepMosaics/-/commit/7b9afd4396bd20364b9910dccd1ae9cd5288de60
......@@ -142,6 +142,8 @@ test*
video_tmp/
result/
nohup.out
.vscode/
#./
/pix2pix
/pix2pixHD
......@@ -156,6 +158,7 @@ nohup.out
/deepmosaic_window
/sftp-config.json
/exe
#./make_datasets
/make_datasets/video
/make_datasets/tmp
......@@ -171,6 +174,7 @@ nohup.out
#mediafile
*iter
*.pth
*.pt
*.jpeg
*.bmp
*.mp4
......@@ -183,4 +187,40 @@ nohup.out
*.MP4
*.JPEG
*.exe
*.npy
\ No newline at end of file
*.npy
*.psd
##############################cpp###################################
# Prerequisites
*.d
# Compiled Object files
*.slo
*.lo
*.o
*.obj
# Precompiled Headers
*.gch
*.pch
# Compiled Dynamic libraries
*.so
*.dylib
*.dll
# Fortran module files
*.mod
*.smod
# Compiled Static libraries
*.lai
*.la
*.a
*.lib
# Executables
*.exe
*.out
*.app
\ No newline at end of file
![image](./imgs/hand.gif)
# <img src="./imgs/icon.jpg" width="48">DeepMosaics
You can use it to automatically remove the mosaics in images and videos, or add mosaics to them.<br>
This project is based on "semantic segmentation" and "Image-to-Image Translation".<br>
<div align="center">
<img src="./imgs/logo.png" width="250"><br><br>
<img src="https://badgen.net/github/stars/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<img src="https://badgen.net/github/forks/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/downloads/hypox64/deepmosaics/total></a>&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/v/release/hypox64/DeepMosaics></a>&emsp;<img src=https://img.shields.io/github/license/hypox64/deepmosaics>
</div>
* [中文版README](./README_CN.md)<br>
# DeepMosaics
**English | [中文](./README_CN.md)**<br>
You can use it to automatically remove the mosaics in images and videos, or add mosaics to them.<br>This project is based on "semantic segmentation" and "Image-to-Image Translation".<br>Try it at this [website](http://118.89.27.46:5000/)!<br>
### More example
### Examples
![image](./imgs/hand.gif)
origin | auto add mosaic | auto clean mosaic
:-:|:-:|:-:
......@@ -28,24 +31,27 @@ origin | to Van Gogh | to winter
An interesting example:[Ricardo Milos to cat](https://www.bilibili.com/video/BV1Q7411W7n6)
## Run DeepMosaics
You can either run DeepMosaics via pre-built binary package or from source.<br>
You can either run DeepMosaics via a pre-built binary package or from source.<br>
### Try it on web
You can simply try to remove the mosaic from **faces** at this [website](http://118.89.27.46:5000/).<br>
### Pre-built binary package
For windows, we bulid a GUI version for easy test.<br>
Download this version and pre-trained model via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
For Windows, we build a GUI version for easy testing.<br>
Download this version and a pre-trained model via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
* [[How to use]](./docs/exe_help.md)<br>
* [[Help document]](./docs/exe_help.md)<br>
* Video tutorial => [[youtube]](https://www.youtube.com/watch?v=1kEmYawJ_vk) [[bilibili]](https://www.bilibili.com/video/BV1QK4y1a7Av)
![image](./imgs/GUI.png)<br>
Notes:<br>
- Require Windows_x86_64, Windows10 is better.<br>
- Requires Windows_x86_64; Windows 10 is recommended.<br>
- Different pre-trained models are suitable for different effects.[[Introduction to pre-trained models]](./docs/pre-trained_models_introduction.md)<br>
- Run time depends on computer performance(The current version does not support gpu, if you need to use gpu please run source).<br>
- Run time depends on the computer's performance (the GPU version performs better but requires CUDA to be installed).<br>
- If the output video cannot be played, you can try [potplayer](https://daumpotplayer.com/download/).<br>
- GUI version update slower than source.<br>
- The GUI version is updated more slowly than the source code.<br>
### Run from source
### Run From Source
#### Prerequisites
- Linux, Mac OS, Windows
- Python 3.6+
......@@ -56,30 +62,29 @@ Attentions:<br>
This code depends on opencv-python and torchvision, available via pip install.
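For reference, a minimal environment setup might look like the following (a sketch only; the exact torch/torchvision install command depends on your platform and CUDA version, see pytorch.org):
```bash
pip install opencv-python torchvision
```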
#### Clone this repo
```bash
git clone https://github.com/HypoX64/DeepMosaics
git clone https://github.com/HypoX64/DeepMosaics.git
cd DeepMosaics
```
#### Get pre-trained models
#### Get Pre-Trained Models
You can download pre-trained models and put them into './pretrained_models'.<br>
[[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ)<br>
[[Introduction to pre-trained models]](./docs/pre-trained_models_introduction.md)<br>
#### Simple example
#### Simple Example
* Add Mosaic (output media will be saved in './result')<br>
```bash
python3 deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --use_gpu 0
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --gpu_id 0
```
* Clean Mosaic (output media will be saved in './result')<br>
```bash
python3 deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --use_gpu 0
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --gpu_id 0
```
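If no CUDA device is available, the same commands also run on the CPU; per the help text in cores/options.py, passing `--gpu_id -1` selects the CPU:
```bash
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --gpu_id -1
```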
#### More parameters
If you want to test other image or video, please refer to this file.<br>
#### More Parameters
If you want to test other images or videos, please refer to this file.<br>
[[options_introduction.md]](./docs/options_introduction.md) <br>
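As a sketch, several of the documented options can be combined in a single invocation (flag names are taken from cores/options.py; the media path and clip times below are illustrative only):
```bash
python deepmosaic.py --media_path ./your_video.mp4 \
    --model_path ./pretrained_models/mosaic/clean_face_HD.pth \
    --gpu_id 0 --start_time 00:01:00 --last_time 00:00:30 \
    --no_preview --result_dir ./result
```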
## Training with your own dataset
## Training With Your Own Dataset
If you want to train with your own dataset, please refer to [training_with_your_own_dataset.md](./docs/training_with_your_own_dataset.md)
## Acknowledgments
This code borrows heavily from [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet).
## Acknowledgements
This code borrows heavily from [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet) [[DFDNet]](https://github.com/csxmli2016/DFDNet) [[GFRNet_pytorch_new]](https://github.com/sonack/GFRNet_pytorch_new).
![image](./imgs/hand.gif)
# <img src="./imgs/icon.jpg" width="48">DeepMosaics
This project uses deep learning to automatically add mosaics to images/videos, or to remove them.<br>It is based on "semantic segmentation" and "Image-to-Image Translation".<br>
<div align="center">
<img src="./imgs/logo.png" width="250"><br><br>
<img src="https://badgen.net/github/stars/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<img src="https://badgen.net/github/forks/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/downloads/hypox64/deepmosaics/total></a>&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/v/release/hypox64/DeepMosaics></a>&emsp;<img src=https://img.shields.io/github/license/hypox64/deepmosaics>
</div>
# DeepMosaics
**[English](./README.md) | 中文**<br>
This project uses deep learning to automatically add mosaics to images/videos, or to remove them.<br>It is based on "semantic segmentation" and "Image-to-Image Translation".<br>You can now try removing mosaics at this [website](http://118.89.27.46:5000/)!<br>
### More examples
### Examples
![image](./imgs/hand.gif)
origin | auto add mosaic | auto clean mosaic
:-:|:-:|:-:
......@@ -26,19 +33,20 @@
## How to run
You can run it via our pre-built binary package or from source.<br>
### Run on the web
Open [this website](http://118.89.27.46:5000/) and upload a photo to get the demosaiced result. Restricted by local laws, **only faces are currently supported**.<br>
### Pre-built binary package
For Windows users, we provide an installation-free package with a GUI.<br>
It can be downloaded in either of the following ways: [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
* [[Tutorial]](./docs/exe_help_CN.md)<br>
* [[Help document]](./docs/exe_help_CN.md)<br>
* [[Video tutorial]](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
![image](./imgs/GUI.png)<br>
Notes:<br>
- Requires 64-bit Windows. I have only run it on Windows 10; other versions are untested<br>
- Please choose a suitable pre-trained model for your needs; different pre-trained models have different effects.[[Introduction to pre-trained models]](./docs/pre-trained_models_introduction_CN.md)<br>
- Run time depends on computer performance; for video files, we recommend running from source on a GPU.<br>
- Run time depends on computer performance; for video files, we recommend running on a GPU.<br>
- If the output video cannot be played, try [potplayer](https://daumpotplayer.com/download/).<br>
- Compared with the source code, updates to this version will lag behind.
......@@ -50,10 +58,10 @@
- [Pytorch 1.0+](https://pytorch.org/)
- CPU or NVIDIA GPU + CUDA CuDNN<br>
#### Python dependencies
The code depends on opencv-python and torchvision, which can be installed via pip install.
#### Clone the source code
```bash
git clone https://github.com/HypoX64/DeepMosaics
git clone https://github.com/HypoX64/DeepMosaics.git
cd DeepMosaics
```
#### Download pre-trained models
......@@ -62,13 +70,13 @@ cd DeepMosaics
[[Introduction to pre-trained models]](./docs/pre-trained_models_introduction_CN.md)<br>
#### A simple example
* Add mosaics to a video; in this example faces are treated as the regions to be mosaicked, and the auto-mosaic region can be changed by switching the pre-trained model (results will be saved to './result')<br>
* Add mosaics to a video or photo; in this example faces are treated as the regions to be mosaicked, and the auto-mosaic region can be changed by switching the pre-trained model (results will be saved to './result')<br>
```bash
python3 deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --use_gpu 0
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --gpu_id 0
```
* Remove mosaics from a video; mosaics on different objects must be removed with the corresponding pre-trained model (results will be saved to './result')<br>
* Remove mosaics from a video or photo; mosaics on different objects must be removed with the corresponding pre-trained model (results will be saved to './result')<br>
```bash
python3 deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --use_gpu 0
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --gpu_id 0
```
#### More parameters
If you want to test other images or videos, please refer to the following file for the input parameters.<br>
......@@ -78,5 +86,5 @@ python3 deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretra
If you want to train a model with your own data, please refer to [training_with_your_own_dataset.md](./docs/training_with_your_own_dataset.md)
## Acknowledgements
The code borrows heavily from the following projects: [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet).
The code borrows heavily from the following projects: [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet) [[DFDNet]](https://github.com/csxmli2016/DFDNet) [[GFRNet_pytorch_new]](https://github.com/sonack/GFRNet_pytorch_new).
import os
from queue import Queue
from threading import Thread
import time
import numpy as np
import cv2
from models import runmodel
from util import mosaic,util,ffmpeg,filt
from util import image_processing as impro
from .init import video_init
'''
---------------------Add Mosaic---------------------
'''
def addmosaic_img(opt,netS):
path = opt.media_path
print('Add Mosaic:',path)
img = impro.imread(path)
mask = runmodel.get_ROI_position(img,netS,opt)[0]
img = mosaic.addmosaic(img,mask,opt)
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_add.jpg'),img)
def get_roi_positions(opt,netS,imagepaths,savemask=True):
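# Scan every frame with netS to locate the ROI; positions and a step.json
# checkpoint are written every 1000 frames so an interrupted task can resume.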
# resume
continue_flag = False
if os.path.isfile(os.path.join(opt.temp_dir,'step.json')):
step = util.loadjson(os.path.join(opt.temp_dir,'step.json'))
resume_frame = int(step['frame'])
if int(step['step'])>2:
mask_index = np.load(os.path.join(opt.temp_dir,'mask_index.npy'))
return mask_index
if int(step['step'])>=2 and resume_frame>0:
pre_positions = np.load(os.path.join(opt.temp_dir,'roi_positions.npy'))
continue_flag = True
imagepaths = imagepaths[resume_frame:]
positions = []
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('mask', cv2.WINDOW_NORMAL)
print('Step:2/4 -- Find mosaic location')
img_read_pool = Queue(4)
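# bounded prefetch queue: the loader thread below reads frames from disk
# so that image I/O overlaps with model inference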
def loader(imagepaths):
for imagepath in imagepaths:
img_origin = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
img_read_pool.put(img_origin)
t = Thread(target=loader,args=(imagepaths,))
t.daemon = True
t.start()
for i,imagepath in enumerate(imagepaths,1):
img_origin = img_read_pool.get()
mask,x,y,size,area = runmodel.get_ROI_position(img_origin,netS,opt)
positions.append([x,y,area])
if savemask:
t = Thread(target=cv2.imwrite,args=(os.path.join(opt.temp_dir+'/ROI_mask',imagepath), mask,))
t.start()
if i%1000==0:
save_positions = np.array(positions)
if continue_flag:
save_positions = np.concatenate((pre_positions,save_positions),axis=0)
np.save(os.path.join(opt.temp_dir,'roi_positions.npy'),save_positions)
step = {'step':2,'frame':i+resume_frame}
util.savejson(os.path.join(opt.temp_dir,'step.json'),step)
#preview result and print
if not opt.no_preview:
cv2.imshow('mask',mask)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(len(imagepaths)),util.get_bar(100*i/len(imagepaths),num=35),util.counttime(t1,t2,i,len(imagepaths)),end='')
if not opt.no_preview:
cv2.destroyAllWindows()
print('\nOptimize ROI locations...')
if continue_flag:
positions = np.concatenate((pre_positions,positions),axis=0)
mask_index = filt.position_medfilt(np.array(positions), 7)
step = {'step':3,'frame':0}
util.savejson(os.path.join(opt.temp_dir,'step.json'),step)
np.save(os.path.join(opt.temp_dir,'roi_positions.npy'),positions)
np.save(os.path.join(opt.temp_dir,'mask_index.npy'),np.array(mask_index))
return mask_index
def addmosaic_video(opt,netS):
path = opt.media_path
fps,imagepaths = video_init(opt,path)[:2]
length = len(imagepaths)
start_frame = int(imagepaths[0][7:13])
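# frame number parsed from the 'output_%06d.<ext>' filenames written by
# ffmpeg.video2image; it is greater than 1 when resuming an unfinished task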
mask_index = get_roi_positions(opt,netS,imagepaths)[(start_frame-1):]
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
# add mosaic
print('Step:3/4 -- Add Mosaic:')
t1 = time.time()
# print(mask_index)
for i,imagepath in enumerate(imagepaths,1):
mask = impro.imread(os.path.join(opt.temp_dir+'/ROI_mask',imagepaths[np.clip(mask_index[i-1]-start_frame,0,1000000)]),'gray')
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
if impro.mask_area(mask)>100:
try:#Avoid unknown errors
img = mosaic.addmosaic(img, mask, opt)
except Exception as e:
print('Warning:',e)
t = Thread(target=cv2.imwrite,args=(os.path.join(opt.temp_dir+'/addmosaic_image',imagepath),img))
t.start()
os.remove(os.path.join(opt.temp_dir+'/video2image',imagepath))
#preview result and print
if not opt.no_preview:
cv2.imshow('preview',img)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i,length),end='')
print()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/addmosaic_image/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_add.mp4'))
\ No newline at end of file
......@@ -2,150 +2,61 @@ import os
import time
import numpy as np
import cv2
from models import runmodel,loadmodel
from util import mosaic,util,ffmpeg,filt,data
import torch
from models import runmodel
from util import data,util,ffmpeg,filt
from util import image_processing as impro
'''
---------------------Video Init---------------------
'''
def video_init(opt,path):
util.clean_tempfiles(opt)
fps,endtime,height,width = ffmpeg.get_video_infos(path)
if opt.fps !=0:
fps = opt.fps
ffmpeg.video2voice(path,opt.temp_dir+'/voice_tmp.mp3',opt.start_time,opt.last_time)
ffmpeg.video2image(path,opt.temp_dir+'/video2image/output_%06d.'+opt.tempimage_type,fps,opt.start_time,opt.last_time)
imagepaths=os.listdir(opt.temp_dir+'/video2image')
imagepaths.sort()
return fps,imagepaths,height,width
'''
---------------------Add Mosaic---------------------
'''
def addmosaic_img(opt,netS):
path = opt.media_path
print('Add Mosaic:',path)
img = impro.imread(path)
mask = runmodel.get_ROI_position(img,netS,opt)[0]
img = mosaic.addmosaic(img,mask,opt)
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_add.jpg'),img)
def addmosaic_video(opt,netS):
path = opt.media_path
fps,imagepaths = video_init(opt,path)[:2]
length = len(imagepaths)
# get position
positions = []
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
print('Find ROI location:')
for i,imagepath in enumerate(imagepaths,1):
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
mask,x,y,size,area = runmodel.get_ROI_position(img,netS,opt)
positions.append([x,y,area])
cv2.imwrite(os.path.join(opt.temp_dir+'/ROI_mask',imagepath),mask)
#preview result and print
if not opt.no_preview:
cv2.imshow('preview',mask)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i,length),end='')
print('\nOptimize ROI locations...')
mask_index = filt.position_medfilt(np.array(positions), 7)
# add mosaic
print('Add Mosaic:')
t1 = time.time()
for i,imagepath in enumerate(imagepaths,1):
mask = impro.imread(os.path.join(opt.temp_dir+'/ROI_mask',imagepaths[mask_index[i-1]]),'gray')
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
if impro.mask_area(mask)>100:
try:#Avoid unknown errors
img = mosaic.addmosaic(img, mask, opt)
except Exception as e:
print('Warning:',e)
cv2.imwrite(os.path.join(opt.temp_dir+'/addmosaic_image',imagepath),img)
#preview result and print
if not opt.no_preview:
cv2.imshow('preview',img)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i,length),end='')
print()
if not opt.no_preview:
cv2.destroyAllWindows()
ffmpeg.image2video( fps,
opt.temp_dir+'/addmosaic_image/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_add.mp4'))
'''
---------------------Style Transfer---------------------
'''
def styletransfer_img(opt,netG):
print('Style Transfer_img:',opt.media_path)
img = impro.imread(opt.media_path)
img = runmodel.run_styletransfer(opt, netG, img)
suffix = os.path.basename(opt.model_path).replace('.pth','').replace('style_','')
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(opt.media_path))[0]+'_'+suffix+'.jpg'),img)
def styletransfer_video(opt,netG):
path = opt.media_path
positions = []
fps,imagepaths = video_init(opt,path)[:2]
print('Transfer:')
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
length = len(imagepaths)
for i,imagepath in enumerate(imagepaths,1):
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
img = runmodel.run_styletransfer(opt, netG, img)
cv2.imwrite(os.path.join(opt.temp_dir+'/style_transfer',imagepath),img)
#preview result and print
if not opt.no_preview:
cv2.imshow('preview',img)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i,len(imagepaths)),end='')
print()
if not opt.no_preview:
cv2.destroyAllWindows()
suffix = os.path.basename(opt.model_path).replace('.pth','').replace('style_','')
ffmpeg.image2video( fps,
opt.temp_dir+'/style_transfer/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_'+suffix+'.mp4'))
from .init import video_init
from multiprocessing import Queue, Process
from threading import Thread
'''
---------------------Clean Mosaic---------------------
'''
def get_mosaic_positions(opt,netM,imagepaths,savemask=True):
# get mosaic position
# resume
continue_flag = False
if os.path.isfile(os.path.join(opt.temp_dir,'step.json')):
step = util.loadjson(os.path.join(opt.temp_dir,'step.json'))
resume_frame = int(step['frame'])
if int(step['step'])>2:
pre_positions = np.load(os.path.join(opt.temp_dir,'mosaic_positions.npy'))
return pre_positions
if int(step['step'])>=2 and resume_frame>0:
pre_positions = np.load(os.path.join(opt.temp_dir,'mosaic_positions.npy'))
continue_flag = True
imagepaths = imagepaths[resume_frame:]
positions = []
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('mosaic mask', cv2.WINDOW_NORMAL)
print('Step:2/4 -- Find mosaic location')
img_read_pool = Queue(4)
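# bounded prefetch queue: the loader thread below reads frames from disk
# so that image I/O overlaps with mosaic detection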
def loader(imagepaths):
for imagepath in imagepaths:
img_origin = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
img_read_pool.put(img_origin)
t = Thread(target=loader,args=(imagepaths,))
t.daemon = True
t.start()
print('Find mosaic location:')
for i,imagepath in enumerate(imagepaths,1):
img_origin = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
img_origin = img_read_pool.get()
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
positions.append([x,y,size])
if savemask:
cv2.imwrite(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath), mask)
t = Thread(target=cv2.imwrite,args=(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath), mask,))
t.start()
if i%1000==0:
save_positions = np.array(positions)
if continue_flag:
save_positions = np.concatenate((pre_positions,save_positions),axis=0)
np.save(os.path.join(opt.temp_dir,'mosaic_positions.npy'),save_positions)
step = {'step':2,'frame':i+resume_frame}
util.savejson(os.path.join(opt.temp_dir,'step.json'),step)
#preview result and print
if not opt.no_preview:
cv2.imshow('mosaic mask',mask)
......@@ -157,7 +68,12 @@ def get_mosaic_positions(opt,netM,imagepaths,savemask=True):
cv2.destroyAllWindows()
print('\nOptimize mosaic locations...')
positions =np.array(positions)
if continue_flag:
positions = np.concatenate((pre_positions,positions),axis=0)
for i in range(3):positions[:,i] = filt.medfilt(positions[:,i],opt.medfilt_num)
step = {'step':3,'frame':0}
util.savejson(os.path.join(opt.temp_dir,'step.json'),step)
np.save(os.path.join(opt.temp_dir,'mosaic_positions.npy'),positions)
return positions
......@@ -167,7 +83,7 @@ def cleanmosaic_img(opt,netG,netM):
print('Clean Mosaic:',path)
img_origin = impro.imread(path)
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
cv2.imwrite('./mask/'+os.path.basename(path), mask)
#cv2.imwrite('./mask/'+os.path.basename(path), mask)
img_result = img_origin.copy()
if size > 100 :
img_mosaic = img_origin[y-size:y+size,x-size:x+size]
......@@ -180,16 +96,30 @@ def cleanmosaic_img(opt,netG,netM):
print('No mosaic found')
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_clean.jpg'),img_result)
def cleanmosaic_img_server(opt,img_origin,netG,netM):
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
img_result = img_origin.copy()
if size > 100 :
img_mosaic = img_origin[y-size:y+size,x-size:x+size]
if opt.traditional:
img_fake = runmodel.traditional_cleaner(img_mosaic,opt)
else:
img_fake = runmodel.run_pix2pix(img_mosaic,netG,opt)
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
return img_result
def cleanmosaic_video_byframe(opt,netG,netM):
path = opt.media_path
fps,imagepaths = video_init(opt,path)[:2]
positions = get_mosaic_positions(opt,netM,imagepaths,savemask=True)
fps,imagepaths,height,width = video_init(opt,path)
start_frame = int(imagepaths[0][7:13])
positions = get_mosaic_positions(opt,netM,imagepaths,savemask=True)[(start_frame-1):]
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('clean', cv2.WINDOW_NORMAL)
# clean mosaic
print('Clean Mosaic:')
print('Step:3/4 -- Clean Mosaic:')
length = len(imagepaths)
for i,imagepath in enumerate(imagepaths,0):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
......@@ -206,7 +136,9 @@ def cleanmosaic_video_byframe(opt,netG,netM):
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
except Exception as e:
print('Warning:',e)
cv2.imwrite(os.path.join(opt.temp_dir+'/replace_mosaic',imagepath),img_result)
t = Thread(target=cv2.imwrite,args=(os.path.join(opt.temp_dir+'/replace_mosaic',imagepath), img_result,))
t.start()
os.remove(os.path.join(opt.temp_dir+'/video2image',imagepath))
#preview result and print
if not opt.no_preview:
......@@ -217,6 +149,7 @@ def cleanmosaic_video_byframe(opt,netG,netM):
print()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/replace_mosaic/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
......@@ -224,62 +157,93 @@ def cleanmosaic_video_byframe(opt,netG,netM):
def cleanmosaic_video_fusion(opt,netG,netM):
path = opt.media_path
N = 25
if 'HD' in os.path.basename(opt.model_path):
INPUT_SIZE = 256
else:
INPUT_SIZE = 128
N,T,S = 2,5,3
LEFT_FRAME = (N*S)
POOL_NUM = LEFT_FRAME*2+1
INPUT_SIZE = 256
FRAME_POS = np.linspace(0, (T-1)*S,T,dtype=np.int64)
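# Sliding window over the video: the net consumes T=5 frames sampled every S=3
# frames, so the pool keeps N*S=6 frames on each side of the current frame
# (POOL_NUM=13) and FRAME_POS=[0,3,6,9,12] selects the input frames from it.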
img_pool = []
previous_frame = None
init_flag = True
fps,imagepaths,height,width = video_init(opt,path)
positions = get_mosaic_positions(opt,netM,imagepaths,savemask=True)
start_frame = int(imagepaths[0][7:13])
positions = get_mosaic_positions(opt,netM,imagepaths,savemask=True)[(start_frame-1):]
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('clean', cv2.WINDOW_NORMAL)
# clean mosaic
print('Clean Mosaic:')
print('Step:3/4 -- Clean Mosaic:')
length = len(imagepaths)
img_pool = np.zeros((height,width,3*N), dtype='uint8')
mosaic_input = np.zeros((INPUT_SIZE,INPUT_SIZE,3*N+1), dtype='uint8')
write_pool = Queue(4)
show_pool = Queue(4)
def write_result():
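# writer thread: replaces the mosaic region (or keeps the original frame),
# feeds the preview queue, and deletes the consumed source frame to save disk space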
while True:
save_ori,imagepath,img_origin,img_fake,x,y,size = write_pool.get()
if save_ori:
img_result = img_origin
else:
mask = cv2.imread(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath),0)
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
if not opt.no_preview:
show_pool.put(img_result.copy())
cv2.imwrite(os.path.join(opt.temp_dir+'/replace_mosaic',imagepath),img_result)
os.remove(os.path.join(opt.temp_dir+'/video2image',imagepath))
t = Thread(target=write_result,args=())
t.daemon = True
t.start()
for i,imagepath in enumerate(imagepaths,0):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
input_stream = []
# image read stream
mask = cv2.imread(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath),0)
if i==0 :
for j in range(0,N):
img_pool[:,:,j*3:(j+1)*3] = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+j-12,0,len(imagepaths)-1)]))
else:
img_pool[:,:,0:(N-1)*3] = img_pool[:,:,3:N*3]
img_pool[:,:,(N-1)*3:] = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+12,0,len(imagepaths)-1)]))
img_origin = img_pool[:,:,int((N-1)/2)*3:(int((N-1)/2)+1)*3]
img_result = img_origin.copy()
if i==0 :# init
for j in range(POOL_NUM):
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+j-LEFT_FRAME,0,len(imagepaths)-1)])))
else: # load next frame
img_pool.pop(0)
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+LEFT_FRAME,0,len(imagepaths)-1)])))
img_origin = img_pool[LEFT_FRAME]
# preview result and print
if not opt.no_preview:
if show_pool.qsize()>3:
cv2.imshow('clean',show_pool.get())
cv2.waitKey(1) & 0xFF
if size>100:
if size>50:
try:#Avoid unknown errors
#reshape to network input shape
for pos in FRAME_POS:
input_stream.append(impro.resize(img_pool[pos][y-size:y+size,x-size:x+size], INPUT_SIZE,interpolation=cv2.INTER_CUBIC)[:,:,::-1])
if init_flag:
init_flag = False
previous_frame = input_stream[N]
previous_frame = data.im2tensor(previous_frame,bgr2rgb=True,gpu_id=opt.gpu_id)
mosaic_input[:,:,0:N*3] = impro.resize(img_pool[y-size:y+size,x-size:x+size,:], INPUT_SIZE)
mask_input = impro.resize(mask,np.min(img_origin.shape[:2]))[y-size:y+size,x-size:x+size]
mosaic_input[:,:,-1] = impro.resize(mask_input, INPUT_SIZE)
mosaic_input_tensor = data.im2tensor(mosaic_input,bgr2rgb=False,use_gpu=opt.use_gpu,use_transform = False,is0_1 = False)
unmosaic_pred = netG(mosaic_input_tensor)
img_fake = data.tensor2im(unmosaic_pred,rgb2bgr = False ,is0_1 = False)
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
input_stream = np.array(input_stream).reshape(1,T,INPUT_SIZE,INPUT_SIZE,3).transpose((0,4,1,2,3))
input_stream = data.to_tensor(data.normalize(input_stream),gpu_id=opt.gpu_id)
with torch.no_grad():
unmosaic_pred = netG(input_stream,previous_frame)
img_fake = data.tensor2im(unmosaic_pred,rgb2bgr = True)
previous_frame = unmosaic_pred
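# temporal feedback: the latest prediction becomes the reference frame
# passed to netG on the next iteration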
write_pool.put([False,imagepath,img_origin.copy(),img_fake.copy(),x,y,size])
except Exception as e:
print('Warning:',e)
cv2.imwrite(os.path.join(opt.temp_dir+'/replace_mosaic',imagepath),img_result)
init_flag = True
print('Error:',e)
else:
write_pool.put([True,imagepath,img_origin.copy(),-1,-1,-1,-1])
init_flag = True
#preview result and print
if not opt.no_preview:
cv2.imshow('clean',img_result)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i+1)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i+1,len(imagepaths)),end='')
print()
write_pool.close()
show_pool.close()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/replace_mosaic/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_clean.mp4'))
\ No newline at end of file
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_clean.mp4'))
\ No newline at end of file
import os
from util import util,ffmpeg
'''
---------------------Video Init---------------------
'''
def video_init(opt,path):
fps,endtime,height,width = ffmpeg.get_video_infos(path)
if opt.fps !=0:
fps = opt.fps
# resume
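# if a step.json checkpoint exists, offer to reuse the frames already
# extracted to temp_dir/video2image instead of re-running ffmpeg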
if os.path.isfile(os.path.join(opt.temp_dir,'step.json')):
step = util.loadjson(os.path.join(opt.temp_dir,'step.json'))
if int(step['step'])>=1:
choose = input('There is an unfinished video. Continue it? [y/n] ')
if choose.lower() =='yes' or choose.lower() == 'y':
imagepaths = os.listdir(opt.temp_dir+'/video2image')
imagepaths.sort()
return fps,imagepaths,height,width
print('Step:1/4 -- Convert video to images')
util.file_init(opt)
ffmpeg.video2voice(path,opt.temp_dir+'/voice_tmp.mp3',opt.start_time,opt.last_time)
ffmpeg.video2image(path,opt.temp_dir+'/video2image/output_%06d.'+opt.tempimage_type,fps,opt.start_time,opt.last_time)
imagepaths = os.listdir(opt.temp_dir+'/video2image')
imagepaths.sort()
step = {'step':2,'frame':0}
util.savejson(os.path.join(opt.temp_dir,'step.json'),step)
return fps,imagepaths,height,width
\ No newline at end of file
import argparse
import os
import sys
class Options():
......@@ -10,10 +11,11 @@ class Options():
def initialize(self):
#base
self.parser.add_argument('--use_gpu', type=int,default=0, help='if -1, use cpu')
self.parser.add_argument('--debug', action='store_true', help='if specified, start debug mode')
self.parser.add_argument('--gpu_id', type=str,default='0', help='if -1, use cpu')
self.parser.add_argument('--media_path', type=str, default='./imgs/ruoruo.jpg',help='your videos or images path')
self.parser.add_argument('-ss', '--start_time', type=str, default='00:00:00',help='start position of video, default is the beginning of video')
self.parser.add_argument('-t', '--last_time', type=str, default='00:00:00',help='limit the duration of the video, default is the entire video')
self.parser.add_argument('-t', '--last_time', type=str, default='00:00:00',help='duration of the video, default is the entire video')
self.parser.add_argument('--mode', type=str, default='auto',help='Program running mode. auto | add | clean | style')
self.parser.add_argument('--model_path', type=str, default='./pretrained_models/mosaic/add_face.pth',help='pretrained model path')
self.parser.add_argument('--result_dir', type=str, default='./result',help='output media will be saved here')
......@@ -24,7 +26,7 @@ class Options():
self.parser.add_argument('--fps', type=int, default=0,help='read and output fps, if 0-> origin')
self.parser.add_argument('--no_preview', action='store_true', help='if specified,do not preview images when processing video. eg.(when run it on server)')
self.parser.add_argument('--output_size', type=int, default=0,help='size of output media, if 0 -> origin')
self.parser.add_argument('--mask_threshold', type=int, default=64,help='threshold of recognize clean or add mosaic position 0~255')
self.parser.add_argument('--mask_threshold', type=int, default=64,help='Mosaic detection threshold (0~255). The smaller it is, the more likely a region is judged to be a mosaic area.')
#AddMosaic
self.parser.add_argument('--mosaic_mod', type=str, default='squa_avg',help='type of mosaic -> squa_avg | squa_random | squa_avg_circle_edge | rect_avg | random')
......@@ -58,62 +60,71 @@ class Options():
model_name = os.path.basename(self.opt.model_path)
self.opt.temp_dir = os.path.join(self.opt.temp_dir, 'DeepMosaics_temp')
os.environ["CUDA_VISIBLE_DEVICES"] = str(self.opt.use_gpu)
import torch
if torch.cuda.is_available() and self.opt.use_gpu > -1:
pass
else:
self.opt.use_gpu = -1
if self.opt.gpu_id != '-1':
os.environ["CUDA_VISIBLE_DEVICES"] = str(self.opt.gpu_id)
import torch
if not torch.cuda.is_available():
self.opt.gpu_id = '-1'
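# fall back to CPU when CUDA is unavailable; CUDA_VISIBLE_DEVICES is set
# before torch is imported so that the device mask takes effect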
# else:
# self.opt.gpu_id = '-1'
if test_flag:
if not os.path.exists(self.opt.media_path):
print('Error: Bad media path!')
print('Error: Media does not exist!')
input('Please press any key to exit.\n')
exit(0)
if self.opt.mode == 'auto':
if 'clean' in model_name or self.opt.traditional:
self.opt.mode = 'clean'
elif 'add' in model_name:
self.opt.mode = 'add'
elif 'style' in model_name or 'edges' in model_name:
self.opt.mode = 'style'
else:
print('Please input running model!')
input('Please press any key to exit.\n')
exit(0)
if self.opt.output_size == 0 and self.opt.mode == 'style':
self.opt.output_size = 512
if 'edges' in model_name or 'edges' in self.opt.preprocess:
self.opt.edges = True
if self.opt.netG == 'auto' and self.opt.mode =='clean':
if 'unet_128' in model_name:
self.opt.netG = 'unet_128'
elif 'resnet_9blocks' in model_name:
self.opt.netG = 'resnet_9blocks'
elif 'HD' in model_name and 'video' not in model_name:
self.opt.netG = 'HD'
elif 'video' in model_name:
self.opt.netG = 'video'
else:
print('Type of Generator error!')
sys.exit(0)
if not os.path.exists(self.opt.model_path):
print('Error: Model does not exist!')
input('Please press any key to exit.\n')
exit(0)
if self.opt.ex_mult == 'auto':
if 'face' in model_name:
self.opt.ex_mult = 1.1
sys.exit(0)
if self.opt.mode == 'auto':
if 'clean' in model_name or self.opt.traditional:
self.opt.mode = 'clean'
elif 'add' in model_name:
self.opt.mode = 'add'
elif 'style' in model_name or 'edges' in model_name:
self.opt.mode = 'style'
else:
print('Please check model_path!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.output_size == 0 and self.opt.mode == 'style':
self.opt.output_size = 512
if 'edges' in model_name or 'edges' in self.opt.preprocess:
self.opt.edges = True
if self.opt.netG == 'auto' and self.opt.mode =='clean':
if 'unet_128' in model_name:
self.opt.netG = 'unet_128'
elif 'resnet_9blocks' in model_name:
self.opt.netG = 'resnet_9blocks'
elif 'HD' in model_name and 'video' not in model_name:
self.opt.netG = 'HD'
elif 'video' in model_name:
self.opt.netG = 'video'
else:
print('Type of Generator error!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.ex_mult == 'auto':
if 'face' in model_name:
self.opt.ex_mult = 1.1
else:
self.opt.ex_mult = 1.5
else:
self.opt.ex_mult = 1.5
else:
self.opt.ex_mult = float(self.opt.ex_mult)
if self.opt.mosaic_position_model_path == 'auto':
_path = os.path.join(os.path.split(self.opt.model_path)[0],'mosaic_position.pth')
self.opt.mosaic_position_model_path = _path
# print(self.opt.mosaic_position_model_path)
self.opt.ex_mult = float(self.opt.ex_mult)
if self.opt.mosaic_position_model_path == 'auto' and self.opt.mode == 'clean':
_path = os.path.join(os.path.split(self.opt.model_path)[0],'mosaic_position.pth')
if os.path.isfile(_path):
self.opt.mosaic_position_model_path = _path
else:
input('Please check mosaic_position_model_path!')
input('Please press any key to exit.\n')
sys.exit(0)
return self.opt
\ No newline at end of file
import os
import time
import numpy as np
import cv2
from models import runmodel
from util import mosaic,util,ffmpeg,filt
from util import image_processing as impro
from .init import video_init
'''
---------------------Style Transfer---------------------
'''
def styletransfer_img(opt,netG):
print('Style Transfer_img:',opt.media_path)
img = impro.imread(opt.media_path)
img = runmodel.run_styletransfer(opt, netG, img)
suffix = os.path.basename(opt.model_path).replace('.pth','').replace('style_','')
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(opt.media_path))[0]+'_'+suffix+'.jpg'),img)
def styletransfer_video(opt,netG):
path = opt.media_path
fps,imagepaths = video_init(opt,path)[:2]
print('Step:2/4 -- Transfer')
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
length = len(imagepaths)
for i,imagepath in enumerate(imagepaths,1):
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
img = runmodel.run_styletransfer(opt, netG, img)
cv2.imwrite(os.path.join(opt.temp_dir+'/style_transfer',imagepath),img)
os.remove(os.path.join(opt.temp_dir+'/video2image',imagepath))
#preview result and print
if not opt.no_preview:
cv2.imshow('preview',img)
cv2.waitKey(1) & 0xFF
t2 = time.time()
print('\r',str(i)+'/'+str(length),util.get_bar(100*i/length,num=35),util.counttime(t1,t2,i,len(imagepaths)),end='')
print()
if not opt.no_preview:
cv2.destroyAllWindows()
suffix = os.path.basename(opt.model_path).replace('.pth','').replace('style_','')
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/style_transfer/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_'+suffix+'.mp4'))
\ No newline at end of file
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
set(CMAKE_CXX_STANDARD 14)
project(DeepMosaics)
set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib) # library output path
set(Torch_DIR /home/hypo/libtorch/share/cmake/Torch)
find_package(Torch REQUIRED)
set(OpenCV_DIR /home/hypo/opencv-4.4.0)
find_package(OpenCV REQUIRED)
# Add sub directories
add_subdirectory(example)
add_subdirectory(utils)
# set_property(TARGET ${PROJECT_NAME} PROPERTY CXX_STANDARD 14)
# cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
# project(main)
# set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib) # library output path
# set(Torch_DIR /home/hypo/libtorch/share/cmake/Torch)
# find_package(Torch REQUIRED)
# set(OpenCV_DIR /home/hypo/opencv-4.4.0)
# find_package(OpenCV REQUIRED)
# # Find all source files in the current directory
# # and store their names in the DIR_SRCS variable
# # aux_source_directory(. DIR_SRCS)
# add_subdirectory(utils)
# add_executable(main main.cpp)
# # target_link_libraries(main )
# # include_directories( "${OpenCV_INCLUDE_DIRS}" )
# target_link_libraries( main "${TORCH_LIBRARIES}" "${OpenCV_LIBS}" utils)
# set_property(TARGET main PROPERTY CXX_STANDARD 14)
### C++ version for DeepMosaics
* I am learning C++ through this project...
* It is under development...
\ No newline at end of file
# project(example)
# add_executable("${PROJECT_NAME}" deepmosaic.cpp)
# target_link_libraries( "${PROJECT_NAME}"
# "${TORCH_LIBRARIES}"
# "${OpenCV_LIBS}"
# utils)
file(GLOB_RECURSE srcs RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}" "${CMAKE_CURRENT_SOURCE_DIR}/*.cpp")
foreach(sourcefile IN LISTS srcs)
string( REPLACE ".cpp" "" binname ${sourcefile})
add_executable( ${binname} ${sourcefile} )
target_link_libraries( ${binname}
"${TORCH_LIBRARIES}"
"${OpenCV_LIBS}"
utils)
# set_property(TARGET ${binname} PROPERTY CXX_STANDARD 14)
endforeach()
\ No newline at end of file
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <iostream>
#include <list>
#include <vector>
#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/opencv.hpp>
#include "data.hpp"
#include "util.hpp"
int main() {
std::string path = util::current_path();
std::string net_path = "../res/models/mosaic_position.pth";
std::string img_path = "../res/test_media/face/d.jpg";
cv::Mat img = cv::imread(img_path);
cv::resize(img, img, cv::Size(360, 360), 2);
// img.convertTo(img, CV_32F);
torch::Tensor img_tensor =
torch::from_blob(img.data, {1, img.rows, img.cols, 3}, torch::kByte);
img_tensor = img_tensor.permute({0, 3, 1, 2});
img_tensor = img_tensor.toType(torch::kFloat);
img_tensor = img_tensor.div(255);
std::cout << img_tensor.sizes() << "\n";
// end = clock();
// dur = (double)(end - start);
// printf("Use Time:%f\n", (dur / CLOCKS_PER_SEC));
// std::string net_path = "../res/models/mosaic_position.pt";
// torch::jit::script::Module net;
// try{
// // if (!isfile(net_path)){
// // std::cerr<<"model does not exist\n";
// // }
// net = torch::jit::load(net_path);
// }
// catch(const std::exception& e){
// std::cerr << "error loading the model\n";
// return -1;
// }
// torch::Tensor example = torch::ones({1,3,360,360});
// torch::Tensor output = net.forward({example}).toTensor();
// std::cout<<"ok"<<std::endl;
}
\ No newline at end of file
# Set the project name
project (utils)
aux_source_directory(./src DIR_LIB_SRCS)
add_library(${PROJECT_NAME} SHARED ${DIR_LIB_SRCS})
# Add the .h include search path
target_include_directories( ${PROJECT_NAME}
PUBLIC ${PROJECT_SOURCE_DIR}/include
)
# Link the third-party libraries that are used
target_link_libraries( "${PROJECT_NAME}"
"${OpenCV_LIBS}")
\ No newline at end of file
#ifndef DATA_H
#define DATA_H
#include <opencv2/opencv.hpp>
namespace data {
void normalize(cv::Mat& matrix, double mean = 0.5, double std = 0.5);
} // namespace data
#endif
\ No newline at end of file
#ifndef UTIL_H
#define UTIL_H
#include <iostream>
#include <list>
namespace util {
class Timer {
private:
clock_t tstart, tend;
public:
void start();
void end();
};
// std::string path = util::current_path();
std::string current_path();
// std::string out = util::pathjoin({path, "b", "c"});
std::string pathjoin(const std::list<std::string>& strs);
bool isfile(const std::string& name);
} // namespace util
#endif
\ No newline at end of file
#include "data.hpp"
#include <opencv2/opencv.hpp>
namespace data {
void normalize(cv::Mat& matrix, double mean, double std) {
// matrix = (matrix / 255.0 - mean) / std;
matrix = matrix / (255.0 * std) - mean / std;
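// algebraically identical to the commented form above, folded into a single
// scale-and-shift per element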
}
} // namespace data
\ No newline at end of file
#include "util.hpp"
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>
#include <iostream>
#include <list>
#include <vector>
namespace util {
void Timer::start() {
tstart = clock();
}
void Timer::end() {
tend = clock();
double dur;
dur = (double)(tend - tstart);
std::cout << "Cost Time:" << (dur / CLOCKS_PER_SEC) << "\n";
}
std::string current_path() {
char* buffer;
buffer = getcwd(NULL, 0);
return buffer;
}
std::string pathjoin(const std::list<std::string>& strs) {
std::string res = "";
int cnt = 0;
for (std::string s : strs) {
if (cnt == 0) {
res += s;
} else {
if (s[0] != '/') {
res += ("/" + s);
} else {
res += s;
}
}
cnt++;
}
return res;
}
bool isfile(const std::string& name) {
struct stat buffer;
return (stat(name.c_str(), &buffer) == 0);
}
} // namespace util
\ No newline at end of file
import os
import sys
import traceback
from cores import Options,core
from util import util
from models import loadmodel
try:
from cores import Options,add,clean,style
from util import util
from models import loadmodel
except Exception as e:
print(e)
input('Please press any key to exit.\n')
sys.exit(0)
opt = Options().getparse(test_flag = True)
util.file_init(opt)
if not os.path.isdir(opt.temp_dir):
util.file_init(opt)
def main():
......@@ -19,11 +25,13 @@ def main():
for file in files:
opt.media_path = file
if util.is_img(file):
core.addmosaic_img(opt,netS)
add.addmosaic_img(opt,netS)
elif util.is_video(file):
core.addmosaic_video(opt,netS)
add.addmosaic_video(opt,netS)
util.clean_tempfiles(opt, tmp_init = False)
else:
print('This type of file is not supported')
util.clean_tempfiles(opt, tmp_init = False)
elif opt.mode == 'clean':
netM = loadmodel.bisenet(opt,'mosaic')
......@@ -37,12 +45,13 @@ def main():
for file in files:
opt.media_path = file
if util.is_img(file):
core.cleanmosaic_img(opt,netG,netM)
clean.cleanmosaic_img(opt,netG,netM)
elif util.is_video(file):
if opt.netG == 'video' and not opt.traditional:
core.cleanmosaic_video_fusion(opt,netG,netM)
clean.cleanmosaic_video_fusion(opt,netG,netM)
else:
core.cleanmosaic_video_byframe(opt,netG,netM)
clean.cleanmosaic_video_byframe(opt,netG,netM)
util.clean_tempfiles(opt, tmp_init = False)
else:
print('This type of file is not supported')
......@@ -51,20 +60,35 @@ def main():
for file in files:
opt.media_path = file
if util.is_img(file):
core.styletransfer_img(opt,netG)
style.styletransfer_img(opt,netG)
elif util.is_video(file):
core.styletransfer_video(opt,netG)
style.styletransfer_video(opt,netG)
util.clean_tempfiles(opt, tmp_init = False)
else:
print('This type of file is not supported')
util.clean_tempfiles(opt, tmp_init = False)
if __name__ == '__main__':
if opt.debug:
main()
sys.exit(0)
try:
main()
print('Finished!')
except Exception as ex:
print('--------------------ERROR--------------------')
print('--------------Environment--------------')
print('DeepMosaics: 0.5.1')
print('Python:',sys.version)
import torch
print('Pytorch:',torch.__version__)
import cv2
print('OpenCV:',cv2.__version__)
import platform
print('Platform:',platform.platform())
print('--------------BUG--------------')
ex_type, ex_val, ex_stack = sys.exc_info()
print('Error Type:',ex_type)
print(ex_val)
......@@ -72,5 +96,4 @@ if __name__ == '__main__':
print(stack)
input('Please press any key to exit.\n')
#util.clean_tempfiles(tmp_init = False)
exit(0)
sys.exit(0)
\ No newline at end of file
DeepMosaics V0.3.0
Core program built with windows10_1703_x86_64
+ python 3.6.8
+ pyinstaller 3.5
DeepMosaics: 0.5.1
Core built with:
Python: 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
Pytorch: 1.7.1
OpenCV: 4.1.2
Platform: Windows-10-10.0.19041-SP0
Driver Version: 461.40
CUDA:11.0
GUI built with C#
For more details, see GitHub: https://github.com/HypoX64/DeepMosaics
Release History
V0.3.0
1. Support BiSeNet (better recognition of mosaics).
2. New videoHD model.
3. Better feathering method.
V0.2.0
1. Add video model.
2. Now you can input Chinese paths
3. Support style transfer
4. Support fps limit
V0.1.2
1. Support pix2pixHD model
V0.1.1
1. Check paths; illegal paths can no longer be entered
V0.1.0
1. Initial release.
\ No newline at end of file
V0.5.1
Fix:
1. Fix some bugs when restoring unfinished tasks.
2. Fix audio/video desynchronization when the video is too long.
New:
1. Speed up video processing with asynchronous execution.
V0.5.0
1. New video model (performs better).
V0.4.1
1. Allow unfinished tasks to be restored.
2. Clean the cache during processing.
3. Support CUDA 11.0.
V0.4.0
1. Support GPU.
2. Preview images when processing video.
3. Choose the start position of the video.
V0.3.0
1. Support BiSeNet (better recognition of mosaics).
2. New videoHD model.
3. Better feathering method.
V0.2.0
1. Add video model.
2. Now you can input Chinese paths
3. Support style transfer
4. Support fps limit
V0.1.2
1. Support pix2pixHD model
V0.1.1
1. Check paths; illegal paths can no longer be entered
V0.1.0
1. Initial release.
\ No newline at end of file
## DeepMosaics.exe Instructions
[[中文版]](./exe_help_CN.md)
**[[中文版]](./exe_help_CN.md)**
This is a GUI version compiled in Windows.<br>
Download this version and pre-trained model via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
Video tutorial => [[youtube]](https://www.youtube.com/watch?v=1kEmYawJ_vk) [[bilibili]](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
Notes:<br>
- Requires Windows_x86_64; Windows 10 is recommended.<br>
......@@ -9,11 +11,29 @@ Attentions:<br>
- Run time depends on computer performance.<br>
- If output video cannot be played, you can try with [potplayer](https://daumpotplayer.com/download/).<br>
- The GUI version is updated more slowly than the source code.<br>
### How to install
#### CPU version
* 1. Download and install the Microsoft Visual C++ redistributable:
https://aka.ms/vs/16/release/vc_redist.x64.exe
#### GPU version
Only NVIDIA GPUs at or above the GTX 1060 are supported (driver 460 or above & CUDA 11.0)
* 1. Download and install the Microsoft Visual C++ redistributable:
https://aka.ms/vs/16/release/vc_redist.x64.exe
* 2. Update your GPU driver to 460 (or above):
https://www.nvidia.com/en-us/geforce/drivers/
* 3. Download and install CUDA 11.0:
https://developer.nvidia.com/cuda-toolkit-archive
You can also download them from Baidu Netdisk:
https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ
Password: 1x0a
### How to use
* step 1: Choose image or video.
* step 2: Choose model(Different pre-trained models are suitable for different effects)
* step3: Run program and wait.
* step4: Cheek reult in './result'.
* step 3: Run the program and wait.
* step 4: Check the result in './result'.
### Introduction to pre-trained models
* Mosaic
......@@ -22,10 +42,10 @@ Attentions:<br>
| :------------------------------: | :---------------------------------------------------------: |
| add_face.pth | Add mosaic to all faces in images/videos. |
| clean_face_HD.pth | Clean mosaic to all faces in images/video.<br>(RAM > 8GB). |
| add_youknow.pth | Add mosaic to all (FBI Warning) in images/videos. |
| clean_youknow_resnet_9blocks.pth | Clean mosaic to all (FBI Warning) in images/videos. |
| clean_youknow_video.pth | Clean mosaic to all (FBI Warning) in videos. |
| clean_youknow_video_HD.pth | Clean mosaic to all (FBI Warning) in videos.<br>(RAM > 8GB) |
| add_youknow.pth | Add mosaic to ... in images/videos. |
| clean_youknow_resnet_9blocks.pth | Clean mosaic to ... in images/videos. |
| clean_youknow_video.pth | Clean mosaic to ... in videos. It works better for processing video mosaics |
* Style Transfer
......@@ -50,8 +70,8 @@ Attentions:<br>
* 7. More options can be input.
* 8. Run program.
* 9. Open help file.
* 10. Sponsor our project.
* 11. Version information.
* 10. Sponsor our project.
* 11. Version information.
* 12. Open the URL on github.
### Introduction to options
......@@ -60,7 +80,7 @@ If you need more effects, use '--option your-parameters' to enter what you need
| Option | Description | Default |
| :----------: | :----------------------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | your videos or images path | ./imgs/ruoruo.jpg |
| --mode | program running mode(auto/clean/add/style) | 'auto' |
| --model_path | pretrained model path | ./pretrained_models/mosaic/add_face.pth |
......
## DeepMosaics.exe Instructions
Download the program and pre-trained models via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[百度云,提取码1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
[Video tutorial](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
Notes:<br>
- Requires a 64-bit Windows OS. I have only run it on Windows 10; other versions are untested<br>
- Requires a 64-bit Windows OS. We have only run it on Windows 10; other versions are untested<br>
- Please choose a suitable pre-trained model for your needs<br>
- Run time depends on computer performance; for video files, we recommend running from source on a GPU<br>
- Run time depends on computer performance; for video files, we recommend running on a GPU<br>
- If the output video cannot be played, try [potplayer](https://daumpotplayer.com/download/).<br>
- Compared with the source code, updates to this version will lag behind.
### How to install
#### CPU version
* 1. Download and install the Microsoft Visual C++ redistributable:
https://aka.ms/vs/16/release/vc_redist.x64.exe
#### GPU version
Only NVIDIA GPUs at or above the GTX 1060 are supported (requires a driver above version 460 and CUDA 11.0; note that it must be exactly 11.0)
* 1. Download and install the Microsoft Visual C++ redistributable:
https://aka.ms/vs/16/release/vc_redist.x64.exe
* 2. Update your GPU driver to 460 (or above):
https://www.nvidia.com/en-us/geforce/drivers/
* 3. Download and install CUDA 11.0:
https://developer.nvidia.com/cuda-toolkit-archive
These can also be downloaded from Baidu Cloud:
https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ
Password: 1x0a
### How to use
* step 1: Choose the image or video to process
* step 2: Choose a pre-trained model (different pre-trained models have different effects)
* step 3: Run the program and wait
* step 4: Check the result (saved in the result folder)
## Pre-trained models
The current pre-trained models fall into two categories: adding/removing mosaics, and style transfer.
......@@ -23,10 +44,10 @@
| :------------------------------: | :-------------------------------------------: |
| add_face.pth | Add mosaics to faces in images or videos |
| clean_face_HD.pth | Remove mosaics from faces in images or videos<br>(requires RAM > 8GB). |
| add_youknow.pth | Add mosaics to 18+ content in images or videos |
| clean_youknow_resnet_9blocks.pth | Remove mosaics from 18+ content in images or videos |
| clean_youknow_video.pth | Remove mosaics from 18+ content in videos |
| clean_youknow_video_HD.pth | Remove mosaics from 18+ content in videos<br>(requires RAM > 8GB) |
| add_youknow.pth | Add mosaics to ... content in images or videos |
| clean_youknow_resnet_9blocks.pth | Remove mosaics from ... content in images or videos |
| clean_youknow_video.pth | Remove mosaics from ... content in videos; a model whose name contains 'video' is recommended for removing video mosaics |
* Style transfer
......@@ -52,8 +73,8 @@
* 7. Enter more options yourself; see below for details
* 8. Run
* 9. Open the help file
* 10. Sponsor us
* 11. Version information
* 10. Sponsor us
* 11. Version information
* 12. Open the project's GitHub page
### Introduction to options
......@@ -62,7 +83,7 @@
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | Path of the video or photo to process | ./imgs/ruoruo.jpg |
| --mode | Running mode (auto/clean/add/style) | 'auto' |
| --model_path | Path of the pre-trained model | ./pretrained_models/mosaic/add_face.pth |
......@@ -75,7 +96,7 @@
| --mosaic_mod | Mosaic type -> squa_avg/ squa_random/ squa_avg_circle_edge/ rect_avg/random | squa_avg |
| --mosaic_size | Mosaic size; 0 means automatic | 0 |
| --mask_extend | Extend the mosaic area | 10 |
| --mask_threshold | Mosaic detection threshold (0~255); the smaller it is, the more likely a region is judged to be a mosaic area | 64 |
* Remove mosaic
......
......@@ -5,7 +5,7 @@ If you need more effects, use '--option your-parameters' to enter what you need
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | your videos or images path | ./imgs/ruoruo.jpg |
| --start_time | start position of video, default is the beginning of video | '00:00:00' |
| --last_time | limit the duration of the video, default is the entire video | '00:00:00' |
......
......@@ -5,7 +5,7 @@
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | path of the videos or images to process | ./imgs/ruoruo.jpg |
| --start_time | start position of the video; defaults to the beginning | '00:00:00' |
| --last_time | duration of video to process; defaults to the entire video | '00:00:00' |
......
......@@ -10,8 +10,8 @@ Download pre-trained model via [[Google Drive]](https://drive.google.com/open?i
| clean_face_HD.pth | Clean mosaic from faces in images/videos.<br>(RAM > 8GB). |
| add_youknow.pth | Add mosaic to ... in images/videos. |
| clean_youknow_resnet_9blocks.pth | Clean mosaic from ... in images/videos. |
| clean_youknow_video.pth | Clean mosaic from ... in videos. |
| clean_youknow_video_HD.pth | Clean mosaic from ... in videos.<br>(RAM > 8GB) |
| clean_youknow_video.pth | Clean mosaic from ... in videos; recommended for processing video mosaics |
### Style Transfer
......
......@@ -10,8 +10,8 @@
| clean_face_HD.pth | Clean mosaic from faces in images/videos<br>(requires RAM > 8GB). |
| add_youknow.pth | Add mosaic to ... content in images/videos |
| clean_youknow_resnet_9blocks.pth | Clean mosaic from ... content in images/videos |
| clean_youknow_video.pth | Clean mosaic from ... content in videos |
| clean_youknow_video_HD.pth | Clean mosaic from ... content in videos<br>(requires RAM > 8GB) |
| clean_youknow_video.pth | Clean mosaic from ... content in videos; models with 'video' in the name are recommended for removing mosaics from videos |
### Style Transfer
......
......@@ -10,7 +10,11 @@ We will make "face" as an example. If you don't have any picture, you can downlo
- [Pytorch 1.0+](https://pytorch.org/)
- NVIDIA GPU (with more than 6GB memory) + CUDA cuDNN<br>
#### Dependencies
This code depends on opencv-python, torchvision, and matplotlib, available via pip install.
This code depends on opencv-python, torchvision, matplotlib, tensorboardX, and scikit-image, available via conda install.
```bash
# or
pip install -r requirements.txt
```
#### Clone this repo
```bash
git clone https://github.com/HypoX64/DeepMosaics
......@@ -32,31 +36,31 @@ python draw_mask.py --datadir 'dir for your pictures' --savedir ../datasets/draw
python get_image_from_video.py --datadir 'dir for your videos' --savedir ../datasets/video2image --fps 1
```
### Clean mosaic dataset
We provide several methods for generating clean mosaic datasets. For better results, we recommend training an addmosaic model on a small dataset first and using it to automatically generate datasets from a large one. (recommended: Method 2 (for images) & Method 4 (for videos))
* Method 1: Use drawn masks to make pix2pix(HD) datasets (requires ```origin_image``` and ```mask```)
```bash
python make_pix2pix_dataset.py --datadir ../datasets/draw/face --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod drawn --minsize 128 --square
```
* Method 2: Use an addmosaic model to make pix2pix(HD) datasets (requires an addmosaic pre-trained model)
```bash
python make_pix2pix_dataset.py --datadir 'dir for your pictures' --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod network --model_path ../pretrained_models/mosaic/add_face.pth --minsize 128 --square --mask_threshold 128
```
* Method 3: Use Irregular Masks to make pix2pix(HD) datasets (requires [Irregular Masks](https://nv-adlr.github.io/publication/partialconv-inpainting))
```bash
python make_pix2pix_dataset.py --datadir 'dir for your pictures' --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod irregular --irrholedir ../datasets/Irregular_Holes_mask --square
```
* Method 4: Use an addmosaic model to make video datasets (requires an addmosaic pre-trained model; this works better for video mosaics)
```bash
python make_video_dataset.py --datadir 'dir for your videos' --model_path ../pretrained_models/mosaic/add_face.pth --mask_threshold 96 --savedir ../datasets/video/face
python make_video_dataset.py --model_path ../pretrained_models/mosaic/add_face.pth --gpu_id 0 --datadir 'dir for your videos' --savedir ../datasets/video/face
```
## Training
### Add
```bash
cd train/add
python train.py --use_gpu 0 --dataset ../../datasets/draw/face --savename face --loadsize 512 --finesize 360 --batchsize 16
python train.py --gpu_id 0 --dataset ../../datasets/draw/face --savename face --loadsize 512 --finesize 360 --batchsize 16
```
### Clean
* For image datasets (generated by ```make_pix2pix_dataset.py```)
We use [pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) or [pix2pixHD](https://github.com/NVIDIA/pix2pixHD) to train the model; we take pix2pixHD as an example.
```bash
git clone https://github.com/NVIDIA/pix2pixHD
......@@ -64,10 +68,10 @@ cd pix2pixHD
pip install dominate
python train.py --name face --resize_or_crop resize_and_crop --loadSize 563 --fineSize 512 --label_nc 0 --no_instance --dataroot ../datasets/pix2pix/face
```
* For video datasets (generated by ```make_video_dataset.py```)
```bash
cd train/clean
python train.py --dataset ../../datasets/video/face --savename face --savefreq 100000 --gan --hd --lr 0.0002 --lambda_gan 1 --use_gpu 0
python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 4,5,6,7 --load_thread 16
```
## Testing
Put the saved network in ```./pretrained_models/mosaic/``` and rename it to ```add_face.pth```, ```clean_face_HD.pth```, or ```clean_face_video_HD.pth```
Put the saved network in ```./pretrained_models/mosaic/```, rename it to ```add_face.pth```, ```clean_face_HD.pth```, or ```clean_face_video_HD.pth```, and then run ```deepmosaic.py --model_path ./pretrained_models/mosaic/your_model_name```
......@@ -16,7 +16,7 @@ import torch
from models import runmodel,loadmodel
import util.image_processing as impro
from util import util,mosaic,data
from util import degradater, util,mosaic,data
opt.parser.add_argument('--datadir',type=str,default='../datasets/draw/face', help='')
......@@ -87,11 +87,11 @@ for fold in range(opt.fold):
mask = mask_drawn
if 'irregular' in opt.mod:
mask_irr = impro.imread(irrpaths[random.randint(0,12000-1)],'gray')
mask_irr = data.random_transform_single(mask_irr, (img.shape[0],img.shape[1]))
mask_irr = data.random_transform_single_mask(mask_irr, (img.shape[0],img.shape[1]))
mask = mask_irr
if 'network' in opt.mod:
mask_net = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
if opt.use_gpu != -1:
if opt.gpu_id != -1:
torch.cuda.empty_cache()
if not opt.all_mosaic_area:
mask_net = impro.find_mostlikely_ROI(mask_net)
......@@ -107,11 +107,11 @@ for fold in range(opt.fold):
saveflag = True
if opt.mod == ['drawn','irregular']:
x,y,size,area = impro.boundingSquare(mask_drawn, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask_drawn, random.uniform(1.1,1.6))
elif opt.mod == ['network','irregular']:
x,y,size,area = impro.boundingSquare(mask_net, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask_net, random.uniform(1.1,1.6))
else:
x,y,size,area = impro.boundingSquare(mask, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask, random.uniform(1.1,1.6))
if area < 1000:
saveflag = False
......@@ -130,11 +130,15 @@ for fold in range(opt.fold):
if saveflag:
# add mosaic
img_mosaic = mosaic.addmosaic_random(img, mask)
# random blur
# random degradater
if random.random()>0.5:
Q = random.randint(1,15)
img = impro.dctblur(img,Q)
img_mosaic = impro.dctblur(img_mosaic,Q)
degradate_params = degradater.get_random_degenerate_params(mod='weaker_2')
img = degradater.degradate(img,degradate_params)
img_mosaic = degradater.degradate(img_mosaic,degradate_params)
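            # the same degradate_params are applied to the clean image and the
            # mosaic image so the degraded pair stays pixel-aligned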
# if random.random()>0.5:
# Q = random.randint(1,15)
# img = impro.dctblur(img,Q)
# img_mosaic = impro.dctblur(img_mosaic,Q)
savecnt += 1
......
......@@ -14,7 +14,7 @@ import torch
from models import runmodel,loadmodel
import util.image_processing as impro
from util import util,mosaic,data,ffmpeg
from util import filt, util,mosaic,data,ffmpeg
opt.parser.add_argument('--datadir',type=str,default='your video dir', help='')
......@@ -56,6 +56,7 @@ for videopath in videopaths:
ffmpeg.video2image(videopath, opt.temp_dir+'/video2image/%05d.'+opt.tempimage_type,fps=1,
start_time = util.second2stamp(cut_point*opt.interval),last_time = util.second2stamp(opt.time))
imagepaths = util.Traversal(opt.temp_dir+'/video2image')
imagepaths = sorted(imagepaths)
cnt = 0
for i in range(opt.time):
img = impro.imread(imagepaths[i])
......@@ -92,30 +93,65 @@ for videopath in videopaths:
imagepaths = util.Traversal(opt.temp_dir+'/video2image')
imagepaths = sorted(imagepaths)
imgs=[];masks=[]
mask_flag = False
# mask_flag = False
# for imagepath in imagepaths:
# img = impro.imread(imagepath)
# mask = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
# imgs.append(img)
# masks.append(mask)
# if not mask_flag:
# mask_avg = mask.astype(np.float64)
# mask_flag = True
# else:
# mask_avg += mask.astype(np.float64)
# mask_avg = np.clip(mask_avg/len(imagepaths),0,255).astype('uint8')
# mask_avg = impro.mask_threshold(mask_avg,20,64)
# if not opt.all_mosaic_area:
# mask_avg = impro.find_mostlikely_ROI(mask_avg)
# x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=random.uniform(1.1,1.5))
# for i in range(len(imagepaths)):
# img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
# impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
ex_mul = random.uniform(1.2,1.7)
positions = []
for imagepath in imagepaths:
img = impro.imread(imagepath)
mask = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
imgs.append(img)
masks.append(mask)
if not mask_flag:
mask_avg = mask.astype(np.float64)
mask_flag = True
else:
mask_avg += mask.astype(np.float64)
mask_avg = np.clip(mask_avg/len(imagepaths),0,255).astype('uint8')
mask_avg = impro.mask_threshold(mask_avg,20,64)
if not opt.all_mosaic_area:
mask_avg = impro.find_mostlikely_ROI(mask_avg)
x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=random.uniform(1.1,1.5))
for i in range(len(imagepaths)):
img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
x,y,size,area = impro.boundingSquare(mask,Ex_mul=ex_mul)
positions.append([x,y,size])
positions =np.array(positions)
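        # median-filter the x/y/size tracks over time so the crop window does
        # not jitter between frames (filter width is opt.medfilt_num)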
for i in range(3):positions[:,i] = filt.medfilt(positions[:,i],opt.medfilt_num)
for i,imagepath in enumerate(imagepaths):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
tmp_cnt = i
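            # if the filtered size is too small, fall back to the nearest
            # earlier frame's position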
while size<opt.minsize//2:
tmp_cnt = tmp_cnt-1
x,y,size = positions[tmp_cnt][0],positions[tmp_cnt][1],positions[tmp_cnt][2]
img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
# x_tmp,y_tmp,size_tmp
# for imagepath in imagepaths:
# img = impro.imread(imagepath)
# mask,x,y,halfsize,area = runmodel.get_ROI_position(img,net,opt,keepsize=True)
# if halfsize>opt.minsize//4:
# if not opt.all_mosaic_area:
# mask_avg = impro.find_mostlikely_ROI(mask_avg)
# x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=ex_mul)
# img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
# impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
result_cnt+=1
......@@ -124,5 +160,5 @@ for videopath in videopaths:
util.writelog(os.path.join(opt.savedir,'opt.txt'),
videopath+'\n'+str(result_cnt)+'\n'+str(e))
video_cnt +=1
if opt.use_gpu != -1:
if opt.gpu_id != '-1':
torch.cuda.empty_cache()
import torch
import torch.nn as nn
from .pix2pixHD_model import *
from .model_util import *
from models import model_util
class UpBlock(nn.Module):
def __init__(self, in_channel, out_channel, kernel_size=3, padding=1):
super().__init__()
self.convup = nn.Sequential(
nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
nn.ReflectionPad2d(padding),
# EqualConv2d(out_channel, out_channel, kernel_size, padding=padding),
SpectralNorm(nn.Conv2d(in_channel, out_channel, kernel_size)),
nn.LeakyReLU(0.2),
# Blur(out_channel),
)
def forward(self, input):
outup = self.convup(input)
return outup
class Encoder2d(nn.Module):
def __init__(self, input_nc, ngf=64, n_downsampling=3, activation = nn.LeakyReLU(0.2)):
super(Encoder2d, self).__init__()
model = [nn.ReflectionPad2d(3), SpectralNorm(nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0)), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [ nn.ReflectionPad2d(1),
SpectralNorm(nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=0)),
activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class Encoder3d(nn.Module):
def __init__(self, input_nc, ngf=64, n_downsampling=3, activation = nn.LeakyReLU(0.2)):
super(Encoder3d, self).__init__()
model = [SpectralNorm(nn.Conv3d(input_nc, ngf, kernel_size=3, padding=1)), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [ SpectralNorm(nn.Conv3d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1)),
activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class BVDNet(nn.Module):
def __init__(self, N=2, n_downsampling=3, n_blocks=4, input_nc=3, output_nc=3,activation=nn.LeakyReLU(0.2)):
super(BVDNet, self).__init__()
ngf = 64
padding_type = 'reflect'
self.N = N
### encoder
self.encoder3d = Encoder3d(input_nc,64,n_downsampling,activation)
self.encoder2d = Encoder2d(input_nc,64,n_downsampling,activation)
### resnet blocks
self.blocks = []
mult = 2**n_downsampling
for i in range(n_blocks):
self.blocks += [ResnetBlockSpectralNorm(ngf * mult, padding_type=padding_type, activation=activation)]
self.blocks = nn.Sequential(*self.blocks)
### decoder
self.decoder = []
for i in range(n_downsampling):
mult = 2**(n_downsampling - i)
self.decoder += [UpBlock(ngf * mult, int(ngf * mult / 2))]
self.decoder += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0)]
self.decoder = nn.Sequential(*self.decoder)
self.limiter = nn.Tanh()
def forward(self, stream, previous):
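        # stream: (B, C, T, H, W) window of mosaic frames (index N along T is
        # treated as the center frame, presumably with T = 2N+1);
        # previous: (B, C, H, W) previously restored frame. The center frame
        # is added back as a residual shortcut before the Tanh limiter.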
this_shortcut = stream[:,:,self.N]
stream = self.encoder3d(stream)
stream = stream.reshape(stream.size(0),stream.size(1),stream.size(3),stream.size(4))
previous = self.encoder2d(previous)
x = stream + previous
x = self.blocks(x)
x = self.decoder(x)
x = x+this_shortcut
x = self.limiter(x)
return x
def define_G(N=2, n_blocks=1, gpu_id='-1'):
netG = BVDNet(N = N, n_blocks=n_blocks)
netG = model_util.todevice(netG,gpu_id)
netG.apply(model_util.init_weights)
return netG
################################Discriminator################################
def define_D(input_nc=6, ndf=64, n_layers_D=1, use_sigmoid=False, num_D=3, gpu_id='-1'):
netD = MultiscaleDiscriminator(input_nc, ndf, n_layers_D, use_sigmoid, num_D)
netD = model_util.todevice(netD,gpu_id)
netD.apply(model_util.init_weights)
return netD
class MultiscaleDiscriminator(nn.Module):
def __init__(self, input_nc, ndf=64, n_layers=3, use_sigmoid=False, num_D=3):
super(MultiscaleDiscriminator, self).__init__()
self.num_D = num_D
self.n_layers = n_layers
for i in range(num_D):
netD = NLayerDiscriminator(input_nc, ndf, n_layers, use_sigmoid)
setattr(self, 'layer'+str(i), netD.model)
self.downsample = nn.AvgPool2d(3, stride=2, padding=[1, 1], count_include_pad=False)
def singleD_forward(self, model, input):
return [model(input)]
def forward(self, input):
num_D = self.num_D
result = []
input_downsampled = input
for i in range(num_D):
model = getattr(self, 'layer'+str(num_D-1-i))
result.append(self.singleD_forward(model, input_downsampled))
if i != (num_D-1):
input_downsampled = self.downsample(input_downsampled)
return result
# Defines the PatchGAN discriminator with the specified arguments.
class NLayerDiscriminator(nn.Module):
def __init__(self, input_nc, ndf=64, n_layers=3, use_sigmoid=False):
super(NLayerDiscriminator, self).__init__()
self.n_layers = n_layers
kw = 4
padw = int(np.ceil((kw-1.0)/2))
sequence = [[nn.Conv2d(input_nc, ndf, kernel_size=kw, stride=2, padding=padw), nn.LeakyReLU(0.2)]]
nf = ndf
for n in range(1, n_layers):
nf_prev = nf
nf = min(nf * 2, 512)
sequence += [[
SpectralNorm(nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=2, padding=padw)),
nn.LeakyReLU(0.2)
]]
nf_prev = nf
nf = min(nf * 2, 512)
sequence += [[
SpectralNorm(nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=1, padding=padw)),
nn.LeakyReLU(0.2)
]]
sequence += [[nn.Conv2d(nf, 1, kernel_size=kw, stride=1, padding=padw)]]
if use_sigmoid:
sequence += [[nn.Sigmoid()]]
sequence_stream = []
for n in range(len(sequence)):
sequence_stream += sequence[n]
self.model = nn.Sequential(*sequence_stream)
def forward(self, input):
return self.model(input)
class GANLoss(nn.Module):
def __init__(self, mode='D'):
super(GANLoss, self).__init__()
if mode == 'D':
self.lossf = model_util.HingeLossD()
elif mode == 'G':
self.lossf = model_util.HingeLossG()
self.mode = mode
def forward(self, dis_fake = None, dis_real = None):
if isinstance(dis_fake, list):
if self.mode == 'D':
loss = 0
for i in range(len(dis_fake)):
loss += self.lossf(dis_fake[i][-1],dis_real[i][-1])
elif self.mode =='G':
loss = 0
weight = 2**len(dis_fake)
for i in range(len(dis_fake)):
weight = weight/2
loss += weight*self.lossf(dis_fake[i][-1])
return loss
else:
if self.mode == 'D':
return self.lossf(dis_fake[-1],dis_real[-1])
elif self.mode =='G':
return self.lossf(dis_fake[-1])
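if __name__ == '__main__':
    # Usage sketch (illustrative, not part of the original file; assumes the
    # sibling model_util and pix2pixHD_model modules are importable, e.g. via
    # `python -m models.BVDNet`).
    netG = define_G(N=2, n_blocks=4, gpu_id='-1')
    stream = torch.randn(1, 3, 5, 256, 256)     # B,C,T,H,W with T = 2N+1
    previous = torch.randn(1, 3, 256, 256)      # previously restored frame
    print(netG(stream, previous).shape)         # -> torch.Size([1, 3, 256, 256])
    netD = define_D(input_nc=6, n_layers_D=2, num_D=3, gpu_id='-1')
    criterion_D, criterion_G = GANLoss('D'), GANLoss('G')
    real = torch.randn(2, 6, 256, 256)
    fake = torch.randn(2, 6, 256, 256)
    print(criterion_D(netD(fake.detach()), netD(real)))  # hinge loss summed over scales
    print(criterion_G(netD(fake)))              # full-resolution scale gets the largest weight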
......@@ -2,7 +2,7 @@
import torch.nn as nn
import torch
import torch.nn.functional as F
from . import components
from . import model_util
import warnings
warnings.filterwarnings(action='ignore')
......@@ -43,7 +43,7 @@ class DiceLoss(nn.Module):
class resnet18(torch.nn.Module):
def __init__(self, pretrained=True):
super().__init__()
self.features = components.resnet18(pretrained=pretrained)
self.features = model_util.resnet18(pretrained=pretrained)
self.conv1 = self.features.conv1
self.bn1 = self.features.bn1
self.relu = self.features.relu
......@@ -70,7 +70,7 @@ class resnet18(torch.nn.Module):
class resnet101(torch.nn.Module):
def __init__(self, pretrained=True):
super().__init__()
self.features = components.resnet101(pretrained=pretrained)
self.features = model_util.resnet101(pretrained=pretrained)
self.conv1 = self.features.conv1
self.bn1 = self.features.bn1
self.relu = self.features.relu
......
import torch
from .pix2pix_model import define_G
from .pix2pixHD_model import define_G as define_G_HD
from .unet_model import UNet
from .video_model import MosaicNet
from .videoHD_model import MosaicNet as MosaicNet_HD
from . import model_util
from .pix2pix_model import define_G as pix2pix_G
from .pix2pixHD_model import define_G as pix2pixHD_G
# from .video_model import MosaicNet
# from .videoHD_model import MosaicNet as MosaicNet_HD
from .BiSeNet_model import BiSeNet
from .BVDNet import define_G as video_G
def show_paramsnumber(net,netname='net'):
parameters = sum(param.numel() for param in net.parameters())
parameters = round(parameters/1e6,2)
print(netname+' parameters: '+str(parameters)+'M')
def __patch_instance_norm_state_dict(state_dict, module, keys, i=0):
"""Fix InstanceNorm checkpoints incompatibility (prior to 0.4)"""
key = keys[i]
if i + 1 == len(keys): # at the end, pointing to a parameter/buffer
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'running_mean' or key == 'running_var'):
if getattr(module, key) is None:
state_dict.pop('.'.join(keys))
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'num_batches_tracked'):
state_dict.pop('.'.join(keys))
else:
__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1)
def pix2pix(opt):
# print(opt.model_path,opt.netG)
if opt.netG == 'HD':
netG = define_G_HD(3, 3, 64, 'global' ,4)
netG = pix2pixHD_G(3, 3, 64, 'global' ,4)
else:
netG = define_G(3, 3, 64, opt.netG, norm='batch',use_dropout=True, init_type='normal', gpu_ids=[])
netG = pix2pix_G(3, 3, 64, opt.netG, norm='batch',use_dropout=True, init_type='normal', gpu_ids=[])
show_paramsnumber(netG,'netG')
netG.load_state_dict(torch.load(opt.model_path))
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
if opt.use_gpu != -1:
netG.cuda()
return netG
def style(opt):
if opt.edges:
netG = define_G(1, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=True, init_type='normal', gpu_ids=[])
netG = pix2pix_G(1, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=True, init_type='normal', gpu_ids=[])
else:
netG = define_G(3, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=False, init_type='normal', gpu_ids=[])
netG = pix2pix_G(3, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=False, init_type='normal', gpu_ids=[])
    #in order to load the old pretrained model
#https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/models/base_model.py
......@@ -57,23 +43,19 @@ def style(opt):
# patch InstanceNorm checkpoints prior to 0.4
for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop
__patch_instance_norm_state_dict(state_dict, netG, key.split('.'))
model_util.patch_instance_norm_state_dict(state_dict, netG, key.split('.'))
netG.load_state_dict(state_dict)
if opt.use_gpu != -1:
netG.cuda()
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
return netG
def video(opt):
if 'HD' in opt.model_path:
netG = MosaicNet_HD(3*25+1, 3, norm='instance')
else:
netG = MosaicNet(3*25+1, 3,norm = 'batch')
netG = video_G(N=2,n_blocks=4,gpu_id=opt.gpu_id)
show_paramsnumber(netG,'netG')
netG.load_state_dict(torch.load(opt.model_path))
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
if opt.use_gpu != -1:
netG.cuda()
return netG
def bisenet(opt,type='roi'):
......@@ -86,7 +68,6 @@ def bisenet(opt,type='roi'):
net.load_state_dict(torch.load(opt.model_path))
elif type == 'mosaic':
net.load_state_dict(torch.load(opt.mosaic_position_model_path))
net = model_util.todevice(net,opt.gpu_id)
net.eval()
if opt.use_gpu != -1:
net.cuda()
return net
......@@ -7,11 +7,11 @@ from util import data
import torch
import numpy as np
def run_segment(img,net,size = 360,use_gpu = 0):
def run_segment(img,net,size = 360,gpu_id = '-1'):
img = impro.resize(img,size)
img = data.im2tensor(img,use_gpu = use_gpu, bgr2rgb = False,use_transform = False , is0_1 = True)
img = data.im2tensor(img,gpu_id = gpu_id, bgr2rgb = False, is0_1 = True)
mask = net(img)
mask = data.tensor2im(mask, gray=True,rgb2bgr = False, is0_1 = True)
mask = data.tensor2im(mask, gray=True, is0_1 = True)
return mask
def run_pix2pix(img,net,opt):
......@@ -19,7 +19,7 @@ def run_pix2pix(img,net,opt):
img = impro.resize(img,512)
else:
img = impro.resize(img,128)
img = data.im2tensor(img,use_gpu=opt.use_gpu)
img = data.im2tensor(img,gpu_id=opt.gpu_id)
img_fake = net(img)
img_fake = data.tensor2im(img_fake)
return img_fake
......@@ -50,18 +50,18 @@ def run_styletransfer(opt, net, img):
else:
canny_low = opt.canny-int(opt.canny/2)
canny_high = opt.canny+int(opt.canny/2)
img = cv2.Canny(img,opt.canny-50,opt.canny+50)
img = cv2.Canny(img,canny_low,canny_high)
if opt.only_edges:
return img
img = data.im2tensor(img,use_gpu=opt.use_gpu,gray=True,use_transform = False,is0_1 = False)
img = data.im2tensor(img,gpu_id=opt.gpu_id,gray=True)
else:
img = data.im2tensor(img,use_gpu=opt.use_gpu,gray=False,use_transform = True)
img = data.im2tensor(img,gpu_id=opt.gpu_id)
img = net(img)
img = data.tensor2im(img)
return img
def get_ROI_position(img,net,opt,keepsize=True):
mask = run_segment(img,net,size=360,use_gpu = opt.use_gpu)
mask = run_segment(img,net,size=360,gpu_id = opt.gpu_id)
mask = impro.mask_threshold(mask,opt.mask_extend,opt.mask_threshold)
if keepsize:
mask = impro.resize_like(mask, img)
......@@ -70,7 +70,7 @@ def get_ROI_position(img,net,opt,keepsize=True):
def get_mosaic_position(img_origin,net_mosaic_pos,opt):
h,w = img_origin.shape[:2]
mask = run_segment(img_origin,net_mosaic_pos,size=360,use_gpu = opt.use_gpu)
mask = run_segment(img_origin,net_mosaic_pos,size=360,gpu_id = opt.gpu_id)
# mask_1 = mask.copy()
mask = impro.mask_threshold(mask,ex_mun=int(min(h,w)/20),threshold=opt.mask_threshold)
if not opt.all_mosaic_area:
......
import torch
import torch.nn as nn
import torch.nn.functional as F
from .pix2pixHD_model import *
class encoder_2d(nn.Module):
def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d,
padding_type='reflect'):
assert(n_blocks >= 0)
super(encoder_2d, self).__init__()
activation = nn.ReLU(True)
model = [nn.ReflectionPad2d(3), nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0), norm_layer(ngf), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [nn.ReflectionPad2d(1),nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=0),
norm_layer(ngf * mult * 2), activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class decoder_2d(nn.Module):
def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d,
padding_type='reflect'):
assert(n_blocks >= 0)
super(decoder_2d, self).__init__()
activation = nn.ReLU(True)
model = []
### resnet blocks
mult = 2**n_downsampling
for i in range(n_blocks):
model += [ResnetBlock(ngf * mult, padding_type=padding_type, activation=activation, norm_layer=norm_layer)]
### upsample
for i in range(n_downsampling):
mult = 2**(n_downsampling - i)
# model += [ nn.Upsample(scale_factor = 2, mode='nearest'),
# nn.ReflectionPad2d(1),
# nn.Conv2d(ngf * mult, int(ngf * mult / 2),kernel_size=3, stride=1, padding=0),
# norm_layer(int(ngf * mult / 2)),
# nn.ReLU(True)]
model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1, output_padding=1),
norm_layer(int(ngf * mult / 2)), activation]
model += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0), nn.Tanh()]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class conv_3d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=2,padding=1,norm_layer_3d=nn.BatchNorm3d,use_bias=True):
super(conv_3d, self).__init__()
self.conv = nn.Sequential(
nn.Conv3d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=padding, bias=use_bias),
norm_layer_3d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class conv_2d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=1,padding=1,norm_layer_2d=nn.BatchNorm2d,use_bias=True):
super(conv_2d, self).__init__()
self.conv = nn.Sequential(
nn.ReflectionPad2d(padding),
nn.Conv2d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=0, bias=use_bias),
norm_layer_2d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class encoder_3d(nn.Module):
def __init__(self,in_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(encoder_3d, self).__init__()
self.inconv = conv_3d(1, 64, 7, 2, 3,norm_layer_3d,use_bias)
self.down1 = conv_3d(64, 128, 3, 2, 1,norm_layer_3d,use_bias)
self.down2 = conv_3d(128, 256, 3, 2, 1,norm_layer_3d,use_bias)
self.down3 = conv_3d(256, 512, 3, 2, 1,norm_layer_3d,use_bias)
self.down4 = conv_3d(512, 1024, 3, 1, 1,norm_layer_3d,use_bias)
self.pool = nn.AvgPool3d((5,1,1))
# self.conver2d = nn.Sequential(
# nn.Conv2d(256*int(in_channel/4), 256, kernel_size=3, stride=1, padding=1, bias=use_bias),
# norm_layer_2d(256),
# nn.ReLU(inplace=True),
# )
def forward(self, x):
x = x.view(x.size(0),1,x.size(1),x.size(2),x.size(3))
x = self.inconv(x)
x = self.down1(x)
x = self.down2(x)
x = self.down3(x)
x = self.down4(x)
#print(x.size())
x = self.pool(x)
#print(x.size())
# torch.Size([1, 1024, 16, 16])
# torch.Size([1, 512, 5, 16, 16])
x = x.view(x.size(0),x.size(1),x.size(3),x.size(4))
# x = self.conver2d(x)
return x
# def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d,
# padding_type='reflect')
class ALL(nn.Module):
def __init__(self, in_channel, out_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(ALL, self).__init__()
self.encoder_2d = encoder_2d(4,3,64,4,norm_layer=norm_layer_2d,padding_type='reflect')
self.encoder_3d = encoder_3d(in_channel,norm_layer_2d,norm_layer_3d,use_bias)
self.decoder_2d = decoder_2d(4,3,64,4,norm_layer=norm_layer_2d,padding_type='reflect')
# self.shortcut_cov = conv_2d(3,64,7,1,3,norm_layer_2d,use_bias)
self.merge1 = conv_2d(2048,1024,3,1,1,norm_layer_2d,use_bias)
# self.merge2 = nn.Sequential(
# conv_2d(128,64,3,1,1,norm_layer_2d,use_bias),
# nn.ReflectionPad2d(3),
# nn.Conv2d(64, out_channel, kernel_size=7, padding=0),
# nn.Tanh()
# )
def forward(self, x):
N = int((x.size()[1])/3)
x_2d = torch.cat((x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:], x[:,N-1:N,:,:]), 1)
#shortcut_2d = x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:]
x_2d = self.encoder_2d(x_2d)
x_3d = self.encoder_3d(x)
#x = x_2d + x_3d
x = torch.cat((x_2d,x_3d),1)
x = self.merge1(x)
x = self.decoder_2d(x)
#shortcut_2d = self.shortcut_cov(shortcut_2d)
#x = torch.cat((x,shortcut_2d),1)
#x = self.merge2(x)
return x
def MosaicNet(in_channel, out_channel, norm='batch'):
if norm == 'batch':
# norm_layer_2d = nn.BatchNorm2d
# norm_layer_3d = nn.BatchNorm3d
norm_layer_2d = functools.partial(nn.BatchNorm2d, affine=True, track_running_stats=True)
norm_layer_3d = functools.partial(nn.BatchNorm3d, affine=True, track_running_stats=True)
use_bias = False
elif norm == 'instance':
norm_layer_2d = functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
norm_layer_3d = functools.partial(nn.InstanceNorm3d, affine=False, track_running_stats=False)
use_bias = True
return ALL(in_channel, out_channel, norm_layer_2d, norm_layer_3d, use_bias)
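if __name__ == '__main__':
    # Shape sketch (illustrative, not part of the original file; assumes the
    # wildcard import above provides functools/np, and that this file is run
    # as a module, e.g. `python -m models.videoHD_model`): 25 RGB frames plus
    # one mask channel go in, the restored center frame comes out.
    net = MosaicNet(3 * 25 + 1, 3, norm='instance')
    x = torch.randn(1, 3 * 25 + 1, 256, 256)
    print(net(x).shape)   # -> torch.Size([1, 3, 256, 256])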
import torch
import torch.nn as nn
import torch.nn.functional as F
from .pix2pix_model import *
class encoder_2d(nn.Module):
"""Resnet-based generator that consists of Resnet blocks between a few downsampling/upsampling operations.
We adapt Torch code and idea from Justin Johnson's neural style transfer project(https://github.com/jcjohnson/fast-neural-style)
"""
def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
"""Construct a Resnet-based generator
Parameters:
input_nc (int) -- the number of channels in input images
output_nc (int) -- the number of channels in output images
ngf (int) -- the number of filters in the last conv layer
norm_layer -- normalization layer
use_dropout (bool) -- if use dropout layers
n_blocks (int) -- the number of ResNet blocks
padding_type (str) -- the name of padding layer in conv layers: reflect | replicate | zero
"""
assert(n_blocks >= 0)
super(encoder_2d, self).__init__()
if type(norm_layer) == functools.partial:
use_bias = norm_layer.func == nn.InstanceNorm2d
else:
use_bias = norm_layer == nn.InstanceNorm2d
model = [nn.ReflectionPad2d(3),
nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0, bias=use_bias),
norm_layer(ngf),
nn.ReLU(True)]
n_downsampling = 2
for i in range(n_downsampling): # add downsampling layers
mult = 2 ** i
model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1, bias=use_bias),
norm_layer(ngf * mult * 2),
nn.ReLU(True)]
#torch.Size([1, 256, 32, 32])
self.model = nn.Sequential(*model)
def forward(self, input):
"""Standard forward"""
return self.model(input)
class decoder_2d(nn.Module):
"""Resnet-based generator that consists of Resnet blocks between a few downsampling/upsampling operations.
We adapt Torch code and idea from Justin Johnson's neural style transfer project(https://github.com/jcjohnson/fast-neural-style)
"""
def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
"""Construct a Resnet-based generator
Parameters:
input_nc (int) -- the number of channels in input images
output_nc (int) -- the number of channels in output images
ngf (int) -- the number of filters in the last conv layer
norm_layer -- normalization layer
use_dropout (bool) -- if use dropout layers
n_blocks (int) -- the number of ResNet blocks
padding_type (str) -- the name of padding layer in conv layers: reflect | replicate | zero
"""
super(decoder_2d, self).__init__()
if type(norm_layer) == functools.partial:
use_bias = norm_layer.func == nn.InstanceNorm2d
else:
use_bias = norm_layer == nn.InstanceNorm2d
model = []
n_downsampling = 2
mult = 2 ** n_downsampling
for i in range(n_blocks): # add ResNet blocks
model += [ResnetBlock(ngf * mult, padding_type=padding_type, norm_layer=norm_layer, use_dropout=use_dropout, use_bias=use_bias)]
#torch.Size([1, 256, 32, 32])
for i in range(n_downsampling): # add upsampling layers
mult = 2 ** (n_downsampling - i)
# model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2),
# kernel_size=3, stride=2,
# padding=1, output_padding=1,
# bias=use_bias),
# norm_layer(int(ngf * mult / 2)),
# nn.ReLU(True)]
#https://distill.pub/2016/deconv-checkerboard/
#https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/issues/190
model += [ nn.Upsample(scale_factor = 2, mode='nearest'),
nn.ReflectionPad2d(1),
nn.Conv2d(ngf * mult, int(ngf * mult / 2),kernel_size=3, stride=1, padding=0),
norm_layer(int(ngf * mult / 2)),
nn.ReLU(True)]
# model += [nn.ReflectionPad2d(3)]
# model += [nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0)]
# model += [nn.Tanh()]
# model += [nn.Sigmoid()]
self.model = nn.Sequential(*model)
def forward(self, input):
"""Standard forward"""
return self.model(input)
class conv_3d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=2,padding=1,norm_layer_3d=nn.BatchNorm3d,use_bias=True):
super(conv_3d, self).__init__()
self.conv = nn.Sequential(
nn.Conv3d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=padding, bias=use_bias),
norm_layer_3d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class conv_2d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=1,padding=1,norm_layer_2d=nn.BatchNorm2d,use_bias=True):
super(conv_2d, self).__init__()
self.conv = nn.Sequential(
nn.ReflectionPad2d(padding),
nn.Conv2d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=0, bias=use_bias),
norm_layer_2d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class encoder_3d(nn.Module):
def __init__(self,in_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(encoder_3d, self).__init__()
self.down1 = conv_3d(1, 64, 3, 2, 1,norm_layer_3d,use_bias)
self.down2 = conv_3d(64, 128, 3, 2, 1,norm_layer_3d,use_bias)
self.down3 = conv_3d(128, 256, 3, 1, 1,norm_layer_3d,use_bias)
self.conver2d = nn.Sequential(
nn.Conv2d(256*int(in_channel/4), 256, kernel_size=3, stride=1, padding=1, bias=use_bias),
norm_layer_2d(256),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = x.view(x.size(0),1,x.size(1),x.size(2),x.size(3))
x = self.down1(x)
x = self.down2(x)
x = self.down3(x)
x = x.view(x.size(0),x.size(1)*x.size(2),x.size(3),x.size(4))
x = self.conver2d(x)
return x
class ALL(nn.Module):
def __init__(self, in_channel, out_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(ALL, self).__init__()
self.encoder_2d = encoder_2d(4,-1,64,norm_layer=norm_layer_2d,n_blocks=9)
self.encoder_3d = encoder_3d(in_channel,norm_layer_2d,norm_layer_3d,use_bias)
self.decoder_2d = decoder_2d(4,3,64,norm_layer=norm_layer_2d,n_blocks=9)
self.shortcut_cov = conv_2d(3,64,7,1,3,norm_layer_2d,use_bias)
self.merge1 = conv_2d(512,256,3,1,1,norm_layer_2d,use_bias)
self.merge2 = nn.Sequential(
conv_2d(128,64,3,1,1,norm_layer_2d,use_bias),
nn.ReflectionPad2d(3),
nn.Conv2d(64, out_channel, kernel_size=7, padding=0),
nn.Tanh()
)
def forward(self, x):
N = int((x.size()[1])/3)
x_2d = torch.cat((x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:], x[:,N-1:N,:,:]), 1)
shortcut_2d = x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:]
x_2d = self.encoder_2d(x_2d)
x_3d = self.encoder_3d(x)
x = torch.cat((x_2d,x_3d),1)
x = self.merge1(x)
x = self.decoder_2d(x)
shortcut_2d = self.shortcut_cov(shortcut_2d)
x = torch.cat((x,shortcut_2d),1)
x = self.merge2(x)
return x
def MosaicNet(in_channel, out_channel, norm='batch'):
if norm == 'batch':
# norm_layer_2d = nn.BatchNorm2d
# norm_layer_3d = nn.BatchNorm3d
norm_layer_2d = functools.partial(nn.BatchNorm2d, affine=True, track_running_stats=True)
norm_layer_3d = functools.partial(nn.BatchNorm3d, affine=True, track_running_stats=True)
use_bias = False
elif norm == 'instance':
norm_layer_2d = functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
norm_layer_3d = functools.partial(nn.InstanceNorm3d, affine=False, track_running_stats=False)
use_bias = True
return ALL(in_channel, out_channel, norm_layer_2d, norm_layer_3d, use_bias)
opencv_python==4.5.1.48
numpy==1.19.2
torchvision==0.8.2
torch==1.7.1
matplotlib==3.3.2
tensorboardX==2.2
scikit-image==0.17.2
\ No newline at end of file
import os
import sys
import traceback
import cv2
import numpy as np
try:
from cores import Options,clean
from util import util
from util import image_processing as impro
from models import loadmodel
except Exception as e:
print(e)
input('Please press any key to exit.\n')
sys.exit(0)
# python server.py --gpu_id 0 --model_path ./pretrained_models/mosaic/clean_face_HD.pth
opt = Options()
opt.parser.add_argument('--port',type=int,default=4000, help='')
opt = opt.getparse(True)
netM = loadmodel.bisenet(opt,'mosaic')
netG = loadmodel.pix2pix(opt)
from flask import Flask, request
import base64
import shutil
app = Flask(__name__)
@app.route("/handle", methods=["POST"])
def handle():
result = {}
# to opencv img
try:
imgRec = request.form['img']
imgByte = base64.b64decode(imgRec)
img_np_arr = np.frombuffer(imgByte, np.uint8)
img = cv2.imdecode(img_np_arr, cv2.IMREAD_COLOR)
except Exception as e:
result['img'] = imgRec
result['info'] = 'readfailed'
return result
# run model
try:
if max(img.shape)>1080:
img = impro.resize(img,720,interpolation=cv2.INTER_CUBIC)
img = clean.cleanmosaic_img_server(opt,img,netG,netM)
except Exception as e:
result['img'] = imgRec
result['info'] = 'procfailed'
return result
# return
imgbytes = cv2.imencode('.jpg', img)[1]
imgString = base64.b64encode(imgbytes).decode('utf-8')
result['img'] = imgString
result['info'] = 'ok'
return result
app.run("0.0.0.0", port= opt.port, debug=opt.debug)
\ No newline at end of file
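# Example client for the /handle endpoint above (a sketch; field names follow
# the handler, and the server is assumed to listen on localhost:4000):
#   import base64, requests
#   b64 = base64.b64encode(open('test.jpg', 'rb').read()).decode('utf-8')
#   r = requests.post('http://127.0.0.1:4000/handle', data={'img': b64}).json()
#   if r['info'] == 'ok':
#       open('clean.jpg', 'wb').write(base64.b64decode(r['img']))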
import os
import sys
import traceback
sys.path.append("..")
from util import mosaic
import torch
try:
from cores import Options,add,clean,style
from util import util
from models import loadmodel
except Exception as e:
print(e)
input('Please press any key to exit.\n')
sys.exit(0)
opt = Options().getparse(test_flag = False)
if not os.path.isdir(opt.temp_dir):
util.file_init(opt)
def saveScriptModel(model,example,savepath):
model.cpu()
traced_script_module = torch.jit.trace(model, example)
# try ScriptModel
output = traced_script_module(example)
print(output)
traced_script_module.save(savepath)
savedir = '../cpp/res/models/'
util.makedirs(savedir)
opt.mosaic_position_model_path = '../pretrained_models/mosaic/mosaic_position.pth'
model = loadmodel.bisenet(opt,'mosaic')
example = torch.ones((1,3,360,360))
saveScriptModel(model,example,os.path.join(savedir,'mosaic_position.pt'))
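# Reload the exported TorchScript module to sanity-check it (mosaic_position.pt
# as saved above):
reloaded = torch.jit.load(os.path.join(savedir, 'mosaic_position.pt'))
print(reloaded(example))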
# def main():
# if os.path.isdir(opt.media_path):
# files = util.Traversal(opt.media_path)
# else:
# files = [opt.media_path]
# if opt.mode == 'add':
# netS = loadmodel.bisenet(opt,'roi')
# for file in files:
# opt.media_path = file
# if util.is_img(file):
# add.addmosaic_img(opt,netS)
# elif util.is_video(file):
# add.addmosaic_video(opt,netS)
# util.clean_tempfiles(opt, tmp_init = False)
# else:
# print('This type of file is not supported')
# util.clean_tempfiles(opt, tmp_init = False)
# elif opt.mode == 'clean':
# netM = loadmodel.bisenet(opt,'mosaic')
# if opt.traditional:
# netG = None
# elif opt.netG == 'video':
# netG = loadmodel.video(opt)
# else:
# netG = loadmodel.pix2pix(opt)
# for file in files:
# opt.media_path = file
# if util.is_img(file):
# clean.cleanmosaic_img(opt,netG,netM)
# elif util.is_video(file):
# if opt.netG == 'video' and not opt.traditional:
# clean.cleanmosaic_video_fusion(opt,netG,netM)
# else:
# clean.cleanmosaic_video_byframe(opt,netG,netM)
# util.clean_tempfiles(opt, tmp_init = False)
# else:
# print('This type of file is not supported')
# elif opt.mode == 'style':
# netG = loadmodel.style(opt)
# for file in files:
# opt.media_path = file
# if util.is_img(file):
# style.styletransfer_img(opt,netG)
# elif util.is_video(file):
# style.styletransfer_video(opt,netG)
# util.clean_tempfiles(opt, tmp_init = False)
# else:
# print('This type of file is not supported')
# util.clean_tempfiles(opt, tmp_init = False)
# if __name__ == '__main__':
# main()
\ No newline at end of file
......@@ -54,10 +54,10 @@ util.makedirs(dir_checkpoint)
util.writelog(os.path.join(dir_checkpoint,'loss.txt'),
str(time.asctime(time.localtime(time.time())))+'\n'+util.opt2str(opt))
def Totensor(img,use_gpu=True):
def Totensor(img,gpu_id=True):
size=img.shape[0]
img = torch.from_numpy(img).float()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
img = img.cuda()
return img
......@@ -68,11 +68,11 @@ def loadimage(imagepaths,maskpaths,opt,test_flag = False):
for i in range(len(imagepaths)):
img = impro.resize(impro.imread(imagepaths[i]),opt.loadsize)
mask = impro.resize(impro.imread(maskpaths[i],mod = 'gray'),opt.loadsize)
img,mask = data.random_transform_image(img, mask, opt.finesize, test_flag)
img,mask = data.random_transform_pair_image(img, mask, opt.finesize, test_flag)
images[i] = (img.transpose((2, 0, 1))/255.0)
masks[i] = (mask.reshape(1,1,opt.finesize,opt.finesize)/255.0)
images = Totensor(images,opt.use_gpu)
masks = Totensor(masks,opt.use_gpu)
images = data.to_tensor(images,opt.gpu_id)
masks = data.to_tensor(masks,opt.gpu_id)
return images,masks
......@@ -111,7 +111,7 @@ if opt.continue_train:
f = open(os.path.join(dir_checkpoint,'epoch_log.txt'),'r')
opt.startepoch = int(f.read())
f.close()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
net.cuda()
cudnn.benchmark = True
......@@ -135,7 +135,7 @@ for epoch in range(opt.startepoch,opt.maxepoch):
starttime = datetime.datetime.now()
util.writelog(os.path.join(dir_checkpoint,'loss.txt'),'Epoch {}/{}.'.format(epoch + 1, opt.maxepoch),True)
net.train()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
net.cuda()
epoch_loss = 0
for i in range(int(img_num*0.8/opt.batchsize)):
......
import random
import os
from util.mosaic import get_random_parameter
import numpy as np
import torch
import torchvision.transforms as transforms
import cv2
from . import image_processing as impro
from . import mosaic
transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5))
]
)
def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 = False):
from . import degradater
def to_tensor(data,gpu_id):
data = torch.from_numpy(data)
if gpu_id != '-1':
data = data.cuda()
return data
def normalize(data):
'''
normalize to -1 ~ 1
'''
return (data.astype(np.float32)/255.0-0.5)/0.5
def anti_normalize(data):
return np.clip((data*0.5+0.5)*255,0,255).astype(np.uint8)
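# e.g. normalize() maps uint8 [0, 255] to float32 [-1, 1]; anti_normalize()
# maps it back (up to uint8 rounding)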
def tensor2im(image_tensor, gray=False, rgb2bgr = True ,is0_1 = False, batch_index=0):
image_tensor =image_tensor.data
image_numpy = image_tensor[0].cpu().float().numpy()
image_numpy = image_tensor[batch_index].cpu().float().numpy()
if not is0_1:
image_numpy = (image_numpy + 1)/2.0
......@@ -24,7 +35,7 @@ def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 =
if gray:
h, w = image_numpy.shape[1:]
image_numpy = image_numpy.reshape(h,w)
return image_numpy.astype(imtype)
return image_numpy.astype(np.uint8)
# output 3ch
if image_numpy.shape[0] == 1:
......@@ -32,11 +43,10 @@ def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 =
image_numpy = image_numpy.transpose((1, 2, 0))
if rgb2bgr and not gray:
image_numpy = image_numpy[...,::-1]-np.zeros_like(image_numpy)
return image_numpy.astype(imtype)
return image_numpy.astype(np.uint8)
def im2tensor(image_numpy, imtype=np.uint8, gray=False,bgr2rgb = True, reshape = True, use_gpu = 0, use_transform = True,is0_1 = True):
def im2tensor(image_numpy, gray=False,bgr2rgb = True, reshape = True, gpu_id = '-1',is0_1 = False):
if gray:
h, w = image_numpy.shape
image_numpy = (image_numpy/255.0-0.5)/0.5
......@@ -47,18 +57,15 @@ def im2tensor(image_numpy, imtype=np.uint8, gray=False,bgr2rgb = True, reshape =
h, w ,ch = image_numpy.shape
if bgr2rgb:
image_numpy = image_numpy[...,::-1]-np.zeros_like(image_numpy)
if use_transform:
image_tensor = transform(image_numpy)
if is0_1:
image_numpy = image_numpy/255.0
else:
if is0_1:
image_numpy = image_numpy/255.0
else:
image_numpy = (image_numpy/255.0-0.5)/0.5
image_numpy = image_numpy.transpose((2, 0, 1))
image_tensor = torch.from_numpy(image_numpy).float()
image_numpy = (image_numpy/255.0-0.5)/0.5
image_numpy = image_numpy.transpose((2, 0, 1))
image_tensor = torch.from_numpy(image_numpy).float()
if reshape:
image_tensor = image_tensor.reshape(1,ch,h,w)
if use_gpu != -1:
if gpu_id != '-1':
image_tensor = image_tensor.cuda()
return image_tensor
......@@ -68,53 +75,7 @@ def shuffledata(data,target):
np.random.set_state(state)
np.random.shuffle(target)
def random_transform_video(src,target,finesize,N):
#random blur
if random.random()<0.2:
h,w = src.shape[:2]
src = src[:8*(h//8),:8*(w//8)]
Q_ran = random.randint(1,15)
src[:,:,:3*N] = impro.dctblur(src[:,:,:3*N],Q_ran)
target = impro.dctblur(target,Q_ran)
#random crop
h,w = target.shape[:2]
h_move = int((h-finesize)*random.random())
w_move = int((w-finesize)*random.random())
target = target[h_move:h_move+finesize,w_move:w_move+finesize,:]
src = src[h_move:h_move+finesize,w_move:w_move+finesize,:]
#random flip
if random.random()<0.5:
src = src[:,::-1,:]
target = target[:,::-1,:]
#random color
alpha = random.uniform(-0.1,0.1)
beta = random.uniform(-0.1,0.1)
b = random.uniform(-0.05,0.05)
g = random.uniform(-0.05,0.05)
r = random.uniform(-0.05,0.05)
for i in range(N):
src[:,:,i*3:(i+1)*3] = impro.color_adjust(src[:,:,i*3:(i+1)*3],alpha,beta,b,g,r)
target = impro.color_adjust(target,alpha,beta,b,g,r)
#random resize blur
if random.random()<0.5:
interpolations = [cv2.INTER_LINEAR,cv2.INTER_CUBIC,cv2.INTER_LANCZOS4]
size_ran = random.uniform(0.7,1.5)
interpolation_up = interpolations[random.randint(0,2)]
interpolation_down =interpolations[random.randint(0,2)]
tmp = cv2.resize(src[:,:,:3*N], (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolation_up)
src[:,:,:3*N] = cv2.resize(tmp, (finesize,finesize),interpolation=interpolation_down)
tmp = cv2.resize(target, (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolation_up)
target = cv2.resize(tmp, (finesize,finesize),interpolation=interpolation_down)
return src,target
def random_transform_single(img,out_shape):
def random_transform_single_mask(img,out_shape):
out_h,out_w = out_shape
img = cv2.resize(img,(int(out_w*random.uniform(1.1, 1.5)),int(out_h*random.uniform(1.1, 1.5))))
h,w = img.shape[:2]
......@@ -130,90 +91,65 @@ def random_transform_single(img,out_shape):
img = cv2.resize(img,(out_w,out_h))
return img
def random_transform_image(img,mask,finesize,test_flag = False):
#random scale
if random.random()<0.5:
h,w = img.shape[:2]
loadsize = min((h,w))
a = (float(h)/float(w))*random.uniform(0.9, 1.1)
if h<w:
mask = cv2.resize(mask, (int(loadsize/a),loadsize))
img = cv2.resize(img, (int(loadsize/a),loadsize))
else:
mask = cv2.resize(mask, (loadsize,int(loadsize*a)))
img = cv2.resize(img, (loadsize,int(loadsize*a)))
#random crop
h,w = img.shape[:2]
h_move = int((h-finesize)*random.random())
w_move = int((w-finesize)*random.random())
img_crop = img[h_move:h_move+finesize,w_move:w_move+finesize]
mask_crop = mask[h_move:h_move+finesize,w_move:w_move+finesize]
def get_transform_params():
crop_flag = True
rotat_flag = np.random.random()<0.2
color_flag = True
flip_flag = np.random.random()<0.2
degradate_flag = np.random.random()<0.5
flag_dict = {'crop':crop_flag,'rotat':rotat_flag,'color':color_flag,'flip':flip_flag,'degradate':degradate_flag}
crop_rate = [np.random.random(),np.random.random()]
rotat_rate = np.random.random()
color_rate = [np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05),
np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05)]
flip_rate = np.random.random()
degradate_params = degradater.get_random_degenerate_params(mod='weaker_2')
rate_dict = {'crop':crop_rate,'rotat':rotat_rate,'color':color_rate,'flip':flip_rate,'degradate':degradate_params}
return {'flag':flag_dict,'rate':rate_dict}
def random_transform_single_image(img,finesize,params=None,test_flag = False):
if params is None:
params = get_transform_params()
if params['flag']['degradate']:
img = degradater.degradate(img,params['rate']['degradate'])
if test_flag:
return img_crop,mask_crop
if params['flag']['crop']:
h,w = img.shape[:2]
h_move = int((h-finesize)*params['rate']['crop'][0])
w_move = int((w-finesize)*params['rate']['crop'][1])
img = img[h_move:h_move+finesize,w_move:w_move+finesize]
#random rotation
if random.random()<0.2:
h,w = img_crop.shape[:2]
M = cv2.getRotationMatrix2D((w/2,h/2),90*int(4*random.random()),1)
img = cv2.warpAffine(img_crop,M,(w,h))
mask = cv2.warpAffine(mask_crop,M,(w,h))
else:
img,mask = img_crop,mask_crop
if test_flag:
return img
#random color
img = impro.color_adjust(img,ran=True)
if params['flag']['rotat']:
h,w = img.shape[:2]
M = cv2.getRotationMatrix2D((w/2,h/2),90*int(4*params['rate']['rotat']),1)
img = cv2.warpAffine(img,M,(w,h))
#random flip
if random.random()<0.5:
if random.random()<0.5:
img = img[:,::-1,:]
mask = mask[:,::-1]
else:
img = img[::-1,:,:]
mask = mask[::-1,:]
if params['flag']['color']:
img = impro.color_adjust(img,params['rate']['color'][0],params['rate']['color'][1],
params['rate']['color'][2],params['rate']['color'][3],params['rate']['color'][4])
if params['flag']['flip']:
img = img[:,::-1]
#random blur
if random.random()<0.5:
img = impro.dctblur(img,random.randint(1,15))
# interpolations = [cv2.INTER_LINEAR,cv2.INTER_CUBIC,cv2.INTER_LANCZOS4]
# size_ran = random.uniform(0.7,1.5)
# img = cv2.resize(img, (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolations[random.randint(0,2)])
# img = cv2.resize(img, (finesize,finesize),interpolation=interpolations[random.randint(0,2)])
#check shape
if img.shape[0]!= finesize or img.shape[1]!= finesize or mask.shape[0]!= finesize or mask.shape[1]!= finesize:
if img.shape[0]!= finesize or img.shape[1]!= finesize:
img = cv2.resize(img,(finesize,finesize))
mask = cv2.resize(mask,(finesize,finesize))
print('warning! shape error.')
return img,mask
def load_train_video(videoname,img_index,opt):
N = opt.N
input_img = np.zeros((opt.loadsize,opt.loadsize,3*N+1), dtype='uint8')
# this frame
this_mask = impro.imread(os.path.join(opt.dataset,videoname,'mask','%05d'%(img_index)+'.png'),'gray',loadsize=opt.loadsize)
input_img[:,:,-1] = this_mask
#print(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index)+'.jpg'))
ground_true = impro.imread(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index)+'.jpg'),loadsize=opt.loadsize)
mosaic_size,mod,rect_rat,feather = mosaic.get_random_parameter(ground_true,this_mask)
start_pos = mosaic.get_random_startpos(num=N,bisa_p=0.3,bisa_max=mosaic_size,bisa_max_part=3)
# merge other frame
for i in range(0,N):
img = impro.imread(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index+i-int(N/2))+'.jpg'),loadsize=opt.loadsize)
mask = impro.imread(os.path.join(opt.dataset,videoname,'mask','%05d'%(img_index+i-int(N/2))+'.png'),'gray',loadsize=opt.loadsize)
img_mosaic = mosaic.addmosaic_base(img, mask, mosaic_size,model = mod,rect_rat=rect_rat,feather=feather,start_point=start_pos[i])
input_img[:,:,i*3:(i+1)*3] = img_mosaic
# to tensor
input_img,ground_true = random_transform_video(input_img,ground_true,opt.finesize,N)
input_img = im2tensor(input_img,bgr2rgb=False,use_gpu=-1,use_transform = False,is0_1=False)
ground_true = im2tensor(ground_true,bgr2rgb=False,use_gpu=-1,use_transform = False,is0_1=False)
return input_img,ground_true
return img
def random_transform_pair_image(img,mask,finesize,test_flag = False):
params = get_transform_params()
img = random_transform_single_image(img,finesize,params)
params['flag']['degradate'] = False
params['flag']['color'] = False
mask = random_transform_single_image(mask,finesize,params)
return img,mask
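# Usage sketch (paths illustrative): the same random crop/rotation/flip is
# applied to both image and mask, while color jitter and degradation are
# disabled for the mask:
#   img = impro.imread('face.jpg')
#   mask = impro.imread('face_mask.png', mod='gray')
#   img_t, mask_t = random_transform_pair_image(img, mask, finesize=256)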
def showresult(img1,img2,img3,name,is0_1 = False):
size = img1.shape[3]
......
import os
import random
import numpy as np
from multiprocessing import Process, Queue
from . import image_processing as impro
from . import mosaic,data
class VideoLoader(object):
"""docstring for VideoLoader
Load a single video(Converted to images)
How to use:
1.Init VideoLoader as loader
2.Get data by loader.ori_stream
3.loader.next() to get next stream
"""
def __init__(self, opt, video_dir, test_flag=False):
super(VideoLoader, self).__init__()
self.opt = opt
self.test_flag = test_flag
self.video_dir = video_dir
self.t = 0
self.n_iter = self.opt.M -self.opt.S*(self.opt.T+1)
self.transform_params = data.get_transform_params()
self.ori_load_pool = []
self.mosaic_load_pool = []
self.previous_pred = None
feg_ori = impro.imread(os.path.join(video_dir,'origin_image','00001.jpg'),loadsize=self.opt.loadsize,rgb=True)
feg_mask = impro.imread(os.path.join(video_dir,'mask','00001.png'),mod='gray',loadsize=self.opt.loadsize)
self.mosaic_size,self.mod,self.rect_rat,self.feather = mosaic.get_random_parameter(feg_ori,feg_mask)
self.startpos = [random.randint(0,self.mosaic_size),random.randint(0,self.mosaic_size)]
self.loadsize = self.opt.loadsize
#Init load pool
for i in range(self.opt.S*self.opt.T):
_ori_img = impro.imread(os.path.join(video_dir,'origin_image','%05d' % (i+1)+'.jpg'),loadsize=self.loadsize,rgb=True)
_mask = impro.imread(os.path.join(video_dir,'mask','%05d' % (i+1)+'.png' ),mod='gray',loadsize=self.loadsize)
_mosaic_img = mosaic.addmosaic_base(_ori_img, _mask, self.mosaic_size,0, self.mod,self.rect_rat,self.feather,self.startpos)
_ori_img = data.random_transform_single_image(_ori_img,opt.finesize,self.transform_params)
_mosaic_img = data.random_transform_single_image(_mosaic_img,opt.finesize,self.transform_params)
self.ori_load_pool.append(self.normalize(_ori_img))
self.mosaic_load_pool.append(self.normalize(_mosaic_img))
self.ori_load_pool = np.array(self.ori_load_pool)
self.mosaic_load_pool = np.array(self.mosaic_load_pool)
        #Init first stream
self.ori_stream = self.ori_load_pool [np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
self.mosaic_stream = self.mosaic_load_pool[np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
# stream B,T,H,W,C -> B,C,T,H,W
self.ori_stream = self.ori_stream.reshape (1,self.opt.T,opt.finesize,opt.finesize,3).transpose((0,4,1,2,3))
self.mosaic_stream = self.mosaic_stream.reshape(1,self.opt.T,opt.finesize,opt.finesize,3).transpose((0,4,1,2,3))
        #Init first previous frame
self.previous_pred = self.ori_load_pool[self.opt.S*self.opt.N-1].copy()
# previous B,C,H,W
self.previous_pred = self.previous_pred.reshape(1,opt.finesize,opt.finesize,3).transpose((0,3,1,2))
def normalize(self,data):
'''
normalize to -1 ~ 1
'''
return (data.astype(np.float32)/255.0-0.5)/0.5
def anti_normalize(self,data):
return np.clip((data*0.5+0.5)*255,0,255).astype(np.uint8)
def next(self):
# random
if np.random.random()<0.05:
self.startpos = [random.randint(0,self.mosaic_size),random.randint(0,self.mosaic_size)]
if np.random.random()<0.02:
self.transform_params['rate']['crop'] = [np.random.random(),np.random.random()]
if np.random.random()<0.02:
self.loadsize = np.random.randint(self.opt.finesize,self.opt.loadsize)
if self.t != 0:
self.previous_pred = None
self.ori_load_pool [:self.opt.S*self.opt.T-1] = self.ori_load_pool [1:self.opt.S*self.opt.T]
self.mosaic_load_pool[:self.opt.S*self.opt.T-1] = self.mosaic_load_pool[1:self.opt.S*self.opt.T]
#print(os.path.join(self.video_dir,'origin_image','%05d' % (self.opt.S*self.opt.T+self.t)+'.jpg'))
_ori_img = impro.imread(os.path.join(self.video_dir,'origin_image','%05d' % (self.opt.S*self.opt.T+self.t)+'.jpg'),loadsize=self.loadsize,rgb=True)
_mask = impro.imread(os.path.join(self.video_dir,'mask','%05d' % (self.opt.S*self.opt.T+self.t)+'.png' ),mod='gray',loadsize=self.loadsize)
_mosaic_img = mosaic.addmosaic_base(_ori_img, _mask, self.mosaic_size,0, self.mod,self.rect_rat,self.feather,self.startpos)
_ori_img = data.random_transform_single_image(_ori_img,self.opt.finesize,self.transform_params)
_mosaic_img = data.random_transform_single_image(_mosaic_img,self.opt.finesize,self.transform_params)
_ori_img,_mosaic_img = self.normalize(_ori_img),self.normalize(_mosaic_img)
self.ori_load_pool [self.opt.S*self.opt.T-1] = _ori_img
self.mosaic_load_pool[self.opt.S*self.opt.T-1] = _mosaic_img
self.ori_stream = self.ori_load_pool [np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
self.mosaic_stream = self.mosaic_load_pool[np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
# stream B,T,H,W,C -> B,C,T,H,W
self.ori_stream = self.ori_stream.reshape (1,self.opt.T,self.opt.finesize,self.opt.finesize,3).transpose((0,4,1,2,3))
self.mosaic_stream = self.mosaic_stream.reshape(1,self.opt.T,self.opt.finesize,self.opt.finesize,3).transpose((0,4,1,2,3))
self.t += 1
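# A minimal usage sketch for VideoLoader (editorial, not repo code). It assumes
# `opt` carries the fields read in __init__ above (M, S, T, loadsize, finesize)
# and the dataset layout with origin_image/ and mask/; the path is hypothetical.
def _videoloader_example(opt):
    loader = VideoLoader(opt, '/path/to/dataset/some_video')
    for _ in range(loader.n_iter):
        ori = loader.ori_stream        # shape (1, 3, T, finesize, finesize), in [-1, 1]
        mosaic = loader.mosaic_stream  # same shape, mosaic-degraded frames
        loader.next()                  # slide the temporal window forward by one frame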
class VideoDataLoader(object):
"""VideoDataLoader"""
def __init__(self, opt, videolist, test_flag=False):
super(VideoDataLoader, self).__init__()
self.videolist = []
self.opt = opt
self.test_flag = test_flag
for i in range(self.opt.n_epoch):
self.videolist += videolist.copy()
random.shuffle(self.videolist)
self.each_video_n_iter = self.opt.M -self.opt.S*(self.opt.T+1)
self.n_iter = len(self.videolist)//self.opt.load_thread//self.opt.batchsize*self.each_video_n_iter*self.opt.load_thread
self.queue = Queue(self.opt.load_thread)
self.ori_stream = np.zeros((self.opt.batchsize,3,self.opt.T,self.opt.finesize,self.opt.finesize),dtype=np.float32)# B,C,T,H,W
self.mosaic_stream = np.zeros((self.opt.batchsize,3,self.opt.T,self.opt.finesize,self.opt.finesize),dtype=np.float32)# B,C,T,H,W
self.previous_pred = np.zeros((self.opt.batchsize,3,self.opt.finesize,self.opt.finesize),dtype=np.float32)
self.load_init()
def load(self,videolist):
for load_video_iter in range(len(videolist)//self.opt.batchsize):
iter_videolist = videolist[load_video_iter*self.opt.batchsize:(load_video_iter+1)*self.opt.batchsize]
videoloaders = [VideoLoader(self.opt,os.path.join(self.opt.dataset,iter_videolist[i]),self.test_flag) for i in range(self.opt.batchsize)]
for each_video_iter in range(self.each_video_n_iter):
for i in range(self.opt.batchsize):
self.ori_stream[i] = videoloaders[i].ori_stream
self.mosaic_stream[i] = videoloaders[i].mosaic_stream
if each_video_iter == 0:
self.previous_pred[i] = videoloaders[i].previous_pred
videoloaders[i].next()
if each_video_iter == 0:
self.queue.put([self.ori_stream.copy(),self.mosaic_stream.copy(),self.previous_pred])
else:
self.queue.put([self.ori_stream.copy(),self.mosaic_stream.copy(),None])
def load_init(self):
        ptvn = len(self.videolist)//self.opt.load_thread # per_thread_video_num
for i in range(self.opt.load_thread):
p = Process(target=self.load,args=(self.videolist[i*ptvn:(i+1)*ptvn],))
p.daemon = True
p.start()
def get_data(self):
return self.queue.get()
\ No newline at end of file
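# A hedged training-loop sketch for VideoDataLoader (editorial, not repo code);
# `opt` and `videolist` are assumed to be built elsewhere, e.g. from the
# dataset folder listing.
def _videodataloader_example(opt, videolist):
    dataloader = VideoDataLoader(opt, videolist)
    for step in range(dataloader.n_iter):
        ori_stream, mosaic_stream, previous_pred = dataloader.get_data()
        # previous_pred is only non-None on the first iteration of each clip;
        # after that the model's own prediction is expected to take its place.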
'''
https://github.com/sonack/GFRNet_pytorch_new
'''
import random
import cv2
import numpy as np
def gaussian_blur(img, sigma=3, size=13):
if sigma > 0:
if isinstance(size, int):
size = (size, size)
img = cv2.GaussianBlur(img, size, sigma)
return img
def down(img, scale, shape):
if scale > 1:
h, w, _ = shape
scaled_h, scaled_w = int(h / scale), int(w / scale)
img = cv2.resize(img, (scaled_w, scaled_h), interpolation = cv2.INTER_CUBIC)
return img
def up(img, scale, shape):
if scale > 1:
h, w, _ = shape
img = cv2.resize(img, (w, h), interpolation = cv2.INTER_CUBIC)
return img
def awgn(img, level):
if level > 0:
noise = np.random.randn(*img.shape) * level
img = (img + noise).clip(0,255).astype(np.uint8)
return img
def jpeg_compressor(img,quality):
    if quality > 0: # 0 indicates no lossy compression (i.e. lossless)
encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), quality]
img = cv2.imdecode(cv2.imencode('.jpg', img, encode_param)[1], 1)
return img
def get_random_degenerate_params(mod='strong'):
'''
mod : strong | only_downsample | only_4x | weaker_1 | weaker_2
'''
params = {}
gaussianBlur_size_list = list(range(3,14,2))
if mod == 'strong':
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += [0]
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
awgn_level_list = list(range(1, 8, 1))
jpeg_quality_list = list(range(10, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list) * 0.33) * [0]
elif mod == 'only_downsample':
gaussianBlur_sigma_list = [0]
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
awgn_level_list = [0]
jpeg_quality_list = [0]
elif mod == 'only_4x':
gaussianBlur_sigma_list = [0]
downsample_scale_list = [4]
awgn_level_list = [0]
jpeg_quality_list = [0]
elif mod == 'weaker_1': # 0.5 trigger prob
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += int(len(gaussianBlur_sigma_list)) * [0] # 1/2 trigger this degradation
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
downsample_scale_list += int(len(downsample_scale_list)) * [1]
awgn_level_list = list(range(1, 8, 1))
awgn_level_list += int(len(awgn_level_list)) * [0]
jpeg_quality_list = list(range(10, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list)) * [0]
elif mod == 'weaker_2': # weaker than weaker_1, jpeg [20,40]
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += int(len(gaussianBlur_sigma_list)) * [0] # 1/2 trigger this degradation
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
downsample_scale_list += int(len(downsample_scale_list)) * [1]
awgn_level_list = list(range(1, 8, 1))
awgn_level_list += int(len(awgn_level_list)) * [0]
jpeg_quality_list = list(range(20, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list)) * [0]
params['blur_sigma'] = random.choice(gaussianBlur_sigma_list)
params['blur_size'] = random.choice(gaussianBlur_size_list)
params['updown_scale'] = random.choice(downsample_scale_list)
params['awgn_level'] = random.choice(awgn_level_list)
params['jpeg_quality'] = random.choice(jpeg_quality_list)
return params
def degradate(img,params,jpeg_last = True):
shape = img.shape
    if not params:
        # Note: 'original' is not a valid mod for get_random_degenerate_params
        # (see its docstring), so fall back to the 'strong' preset.
        params = get_random_degenerate_params('strong')
if jpeg_last:
img = gaussian_blur(img,params['blur_sigma'],params['blur_size'])
img = down(img,params['updown_scale'],shape)
img = awgn(img,params['awgn_level'])
img = up(img,params['updown_scale'],shape)
img = jpeg_compressor(img,params['jpeg_quality'])
else:
img = gaussian_blur(img,params['blur_sigma'],params['blur_size'])
img = down(img,params['updown_scale'],shape)
img = awgn(img,params['awgn_level'])
img = jpeg_compressor(img,params['jpeg_quality'])
img = up(img,params['updown_scale'],shape)
return img
\ No newline at end of file
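# Usage sketch (editorial): sample one random degradation preset and apply it
# to an image; 'example.jpg' is a hypothetical path.
def _degradate_example():
    img = cv2.imread('example.jpg')
    params = get_random_degenerate_params(mod='weaker_1')
    return degradate(img, params, jpeg_last=True)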
import os,json
import subprocess
# ffmpeg 3.4.6
def args2cmd(args):
......@@ -32,17 +32,18 @@ def run(args,mode = 0):
return sout
def video2image(videopath, imagepath, fps=0, start_time='00:00:00', last_time='00:00:00'):
args = ['ffmpeg']
if last_time != '00:00:00':
args += ['-ss', start_time]
args += ['-t', last_time]
args += ['-i', '"'+videopath+'"']
if fps != 0:
args += ['-r', str(fps)]
    args += ['-f', 'image2','-q:v','0',imagepath]
run(args)
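# For reference (editorial note): with fps=30, start_time='00:00:10' and
# last_time='00:10:00', args2cmd/run above assemble roughly:
#   ffmpeg -ss 00:00:10 -t 00:10:00 -i "input.mp4" -r 30 -f image2 -q:v 0 <imagepath>
# Putting -ss/-t before -i makes ffmpeg seek on the input instead of decoding
# and discarding frames, which is much faster for long videos.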
def video2voice(videopath, voicepath, start_time='00:00:00', last_time='00:00:00'):
args = ['ffmpeg', '-i', '"'+videopath+'"','-async 1 -f mp3','-b:a 320k']
if last_time != '00:00:00':
args += ['-ss', start_time]
args += ['-t', last_time]
......
import cv2
import numpy as np
import random
from threading import Thread
import platform
......@@ -8,23 +9,14 @@ system_type = 'Linux'
if 'Windows' in platform.platform():
system_type = 'Windows'
DCT_Q = np.array([[8,16,19,22,26,27,29,34],
[16,16,22,24,27,29,34,37],
[19,22,26,27,29,34,34,38],
[22,22,26,27,29,34,37,40],
[22,26,27,29,32,35,40,48],
[26,27,29,32,35,40,48,58],
[26,27,29,34,38,46,56,59],
[27,29,35,38,46,56,69,83]])
def imread(file_path,mod = 'normal',loadsize = 0, rgb=False):
'''
mod: 'normal' | 'gray' | 'all'
loadsize: 0->original
'''
if system_type == 'Linux':
if mod == 'normal':
img = cv2.imread(file_path,1)
elif mod == 'gray':
img = cv2.imread(file_path,0)
elif mod == 'all':
......@@ -33,25 +25,37 @@ def imread(file_path,mod = 'normal',loadsize = 0):
    #On Windows, use cv2.imdecode instead to support non-ASCII (e.g. Chinese) paths.
    #This loses EXIF data; no fix for that yet.
else:
if mod == 'normal':
img = cv2.imdecode(np.fromfile(file_path,dtype=np.uint8),1)
elif mod == 'gray':
img = cv2.imdecode(np.fromfile(file_path,dtype=np.uint8),0)
elif mod == 'all':
img = cv2.imdecode(np.fromfile(file_path,dtype=np.uint8),-1)
if loadsize != 0:
img = resize(img, loadsize, interpolation=cv2.INTER_CUBIC)
if rgb and img.ndim==3:
img = img[:,:,::-1]
return img
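# Usage sketch (editorial; the path is hypothetical): load an image, resized
# through resize() according to loadsize, with BGR->RGB conversion.
#   img = imread('face.jpg', mod='normal', loadsize=512, rgb=True)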
def imwrite(file_path,img,use_thread=False):
    '''
    In order to save images to non-ASCII (e.g. Chinese) paths on Windows;
    this function is only used to save the final output images.
    '''
def subfun(file_path,img):
if system_type == 'Linux':
cv2.imwrite(file_path, img)
else:
cv2.imencode('.jpg', img)[1].tofile(file_path)
if use_thread:
t = Thread(target=subfun,args=(file_path, img,))
        t.daemon = True
        t.start()
else:
subfun(file_path,img)
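# Usage sketch (editorial; 'out.jpg' is a hypothetical path): writing on a
# daemon thread keeps video processing from blocking on disk I/O.
#   imwrite('out.jpg', img, use_thread=True)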
def resize(img,size,interpolation=cv2.INTER_LINEAR):
'''
......@@ -108,6 +112,12 @@ def color_adjust(img,alpha=0,beta=0,b=0,g=0,r=0,ran = False):
return (np.clip(img,0,255)).astype('uint8')
def CAdaIN(src, dst):
    '''
    Give src the style of dst: standardize src (zero mean, unit std),
    then rescale and shift it to dst's std and mean.
    '''
    return np.std(dst)*((src-np.mean(src))/np.std(src))+np.mean(dst)
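# Quick sanity check of CAdaIN on synthetic data (editorial sketch): the output
# matches dst's global mean and std exactly, since src is standardized and then
# rescaled/shifted.
def _cadain_check():
    src = np.random.randn(64, 64).astype(np.float32) * 2.0 + 5.0
    dst = np.random.randn(64, 64).astype(np.float32) * 0.5 - 1.0
    out = CAdaIN(src, dst)
    assert abs(out.mean() - dst.mean()) < 1e-3
    assert abs(out.std() - dst.std()) < 1e-3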
def makedataset(target_image,orgin_image):
target_image = resize(target_image,256)
orgin_image = resize(orgin_image,256)
......@@ -116,34 +126,6 @@ def makedataset(target_image,orgin_image):
img[0:256,0:256] = target_image[0:256,int(w/2-256/2):int(w/2+256/2)]
img[0:256,256:512] = orgin_image[0:256,int(w/2-256/2):int(w/2+256/2)]
return img
def block_dct_and_idct(g,QQF,QQF_16):
return cv2.idct(np.round(16.0*cv2.dct(g)/QQF)*QQF_16)
def image_dct_and_idct(I,QF):
h,w = I.shape
QQF = DCT_Q*QF
QQF_16 = QQF/16.0
for i in range(h//8):
for j in range(w//8):
I[i*8:(i+1)*8,j*8:(j+1)*8] = cv2.idct(np.round(16.0*cv2.dct(I[i*8:(i+1)*8,j*8:(j+1)*8])/QQF)*QQF_16)
#I[i*8:(i+1)*8,j*8:(j+1)*8] = block_dct_and_idct(I[i*8:(i+1)*8,j*8:(j+1)*8],QQF,QQF_16)
return I
def dctblur(img,Q):
'''
Q: 1~20, 1->best
'''
h,w = img.shape[:2]
img = img[:8*(h//8),:8*(w//8)]
img = img.astype(np.float32)
if img.ndim == 2:
img = image_dct_and_idct(img, Q)
if img.ndim == 3:
h,w,ch = img.shape
for i in range(ch):
img[:,:,i] = image_dct_and_idct(img[:,:,i], Q)
return (np.clip(img,0,255)).astype(np.uint8)
def find_mostlikely_ROI(mask):
contours,hierarchy=cv2.findContours(mask, cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
......@@ -210,6 +192,30 @@ def mask_area(mask):
area = 0
return area
def replace_mosaic(img_origin,img_fake,mask,x,y,size,no_feather):
img_fake = cv2.resize(img_fake,(size*2,size*2),interpolation=cv2.INTER_CUBIC)
if no_feather:
img_origin[y-size:y+size,x-size:x+size]=img_fake
return img_origin
else:
# #color correction
# RGB_origin = img_origin[y-size:y+size,x-size:x+size].mean(0).mean(0)
# RGB_fake = img_fake.mean(0).mean(0)
# for i in range(3):img_fake[:,:,i] = np.clip(img_fake[:,:,i]+RGB_origin[i]-RGB_fake[i],0,255)
        #feather ("eclosion") the mask edges so the patch blends smoothly
eclosion_num = int(size/10)+2
mask_crop = cv2.resize(mask,(img_origin.shape[1],img_origin.shape[0]))[y-size:y+size,x-size:x+size]
mask_crop = ch_one2three(mask_crop)
mask_crop = (cv2.blur(mask_crop, (eclosion_num, eclosion_num)))
mask_crop = mask_crop/255.0
img_crop = img_origin[y-size:y+size,x-size:x+size]
img_origin[y-size:y+size,x-size:x+size] = np.clip((img_crop*(1-mask_crop)+img_fake*mask_crop),0,255).astype('uint8')
return img_origin
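# Usage sketch (editorial; coordinates and size are hypothetical): paste a
# restored 2*size x 2*size patch back into the frame at center (x, y), with
# feathered blending along the mask edges.
#   frame = replace_mosaic(frame, restored_patch, mask, x=320, y=240, size=64, no_feather=False)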
def Q_lapulase(resImg):
'''
......@@ -223,31 +229,25 @@ def Q_lapulase(resImg):
score = res.var()
return score
def replace_mosaic(img_origin,img_fake,mask,x,y,size,no_feather):
img_fake = cv2.resize(img_fake,(size*2,size*2),interpolation=cv2.INTER_LANCZOS4)
if no_feather:
img_origin[y-size:y+size,x-size:x+size]=img_fake
img_result = img_origin
else:
#color correction
RGB_origin = img_origin[y-size:y+size,x-size:x+size].mean(0).mean(0)
RGB_fake = img_fake.mean(0).mean(0)
for i in range(3):img_fake[:,:,i] = np.clip(img_fake[:,:,i]+RGB_origin[i]-RGB_fake[i],0,255)
#eclosion
eclosion_num = int(size/5)
entad = int(eclosion_num/2+2)
mask = cv2.resize(mask,(img_origin.shape[1],img_origin.shape[0]))
mask = ch_one2three(mask)
mask = (cv2.blur(mask, (eclosion_num, eclosion_num)))
mask_tmp = np.zeros_like(mask)
mask_tmp[y-size:y+size,x-size:x+size] = mask[y-size:y+size,x-size:x+size]# Fix edge overflow
mask = mask_tmp/255.0
img_tmp = np.zeros(img_origin.shape)
img_tmp[y-size:y+size,x-size:x+size]=img_fake
img_result = img_origin.copy()
        img_result = (img_origin*(1-mask)+img_tmp*mask).astype('uint8')
    return img_result
def psnr(img1,img2):
mse = np.mean((img1/255.0-img2/255.0)**2)
if mse < 1e-10:
return 100
psnr_v = 20*np.log10(1/np.sqrt(mse))
return psnr_v
def splice(imgs,splice_shape):
    '''Stitch multiple same-sized images into one grid.
    imgs : [img1,img2,img3,img4]
    splice_shape: (2,2)
    '''
h,w,ch = imgs[0].shape
output = np.zeros((h*splice_shape[0],w*splice_shape[1],ch),np.uint8)
cnt = 0
for i in range(splice_shape[0]):
for j in range(splice_shape[1]):
if cnt < len(imgs):
output[h*i:h*(i+1),w*j:w*(j+1)] = imgs[cnt]
cnt += 1
return output
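# Usage sketch (editorial; the image names are hypothetical): tile four
# same-sized result images into a 2x2 grid, e.g. for visual comparison.
#   grid = splice([ori, mosaic, pred, gt], (2, 2))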
\ No newline at end of file
import json
import os
import random
import string
import shutil
def Traversal(filedir):
......@@ -10,6 +13,9 @@ def Traversal(filedir):
Traversal(dir)
return file_list
def randomstr(num):
return ''.join(random.sample(string.ascii_letters + string.digits, num))
def is_img(path):
ext = os.path.splitext(path)[1]
ext = ext.lower()
......@@ -47,13 +53,25 @@ def is_dirs(paths):
tmp.append(path)
return tmp
def writelog(path,log,isprint=False):
f = open(path,'a+')
f.write(log+'\n')
f.close()
if isprint:
print(log)
def savejson(path, data_dict):
    with open(path, 'w') as f:
        f.write(json.dumps(data_dict))

def loadjson(path):
    with open(path, 'r') as f:
        return json.loads(f.read())
def makedirs(path):
if os.path.isdir(path):
        print(path, 'already exists')
......@@ -70,12 +88,13 @@ def clean_tempfiles(opt,tmp_init=True):
os.makedirs(tmpdir)
os.makedirs(os.path.join(tmpdir, 'video2image'))
os.makedirs(os.path.join(tmpdir, 'addmosaic_image'))
os.makedirs(os.path.join(tmpdir, 'replace_mosaic'))
os.makedirs(os.path.join(tmpdir, 'mosaic_mask'))
os.makedirs(os.path.join(tmpdir, 'ROI_mask'))
os.makedirs(os.path.join(tmpdir, 'style_transfer'))
# make dataset
os.makedirs(os.path.join(tmpdir, 'mosaic_crop'))
os.makedirs(os.path.join(tmpdir, 'ROI_mask_check'))
def file_init(opt):
if not os.path.isdir(opt.result_dir):
......@@ -90,11 +109,16 @@ def second2stamp(s):
s = int(s%60)
return "%02d:%02d:%02d" % (h, m, s)
def stamp2second(stamp):
substamps = stamp.split(':')
return int(substamps[0])*3600 + int(substamps[1])*60 + int(substamps[2])
def counttime(start_time,current_time,now_num,all_num):
'''
start_time,current_time: time.time()
'''
used_time = int(current_time-start_time)
all_time = int(used_time/now_num*all_num)
return second2stamp(used_time)+'/'+second2stamp(all_time)
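# Usage sketch (editorial): report "elapsed/estimated-total" while looping;
# assumes `import time` at the call site.
#   t_start = time.time()
#   for i in range(all_num):
#       ...                                                   # process one item
#       print(counttime(t_start, time.time(), i+1, all_num))  # e.g. 00:01:30/01:00:00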
......