Commit dc7d1f10 authored by LielinJiang

Merge branch 'master' of https://github.com/PaddlePaddle/PaddleGAN into release/0.1.0

<div align='center'>
<img src='./docs/imgs/ppgan.jpg'>
</div>
Simplified Chinese | [English](./README_en.md)

# PaddleGAN

PaddleGAN, the PaddlePaddle Generative Adversarial Network development kit, provides developers with high-performance implementations of classic and cutting-edge GANs, and supports quickly building, training, and deploying GANs for academic, entertainment, and industrial applications.

GAN (Generative Adversarial Network), praised by Yann LeCun, the "father of convolutional networks", as "one of the most interesting ideas in computer science in the past decade", has become one of the hottest and most closely watched deep learning research directions among AI researchers in recent years.
[![License](https://img.shields.io/badge/license-Apache%202-red.svg)](LICENSE)![python version](https://img.shields.io/badge/python-3.6+-orange.svg)
## Quick Start

* Make sure you have installed PaddlePaddle and PaddleGAN correctly by following the [installation guide](./docs/zh_CN/install.md).
* Use pretrained models through the ppgan.apps interface:

```python
from ppgan.apps import RealSRPredictor
sr = RealSRPredictor()
sr.run("docs/imgs/monarch.png")
```
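As a colorization example under the same interface, here is a minimal sketch using the DeOldifyPredictor parameters documented later in this commit (the input path is illustrative):

```python
from ppgan.apps import DeOldifyPredictor

# artistic=True selects the higher-variance "artistic" weights;
# render_factor trades speed and color vividness against image quality.
colorizer = DeOldifyPredictor(artistic=True, render_factor=32)
colorizer.run("docs/imgs/test_old_photo.png")  # illustrative input path
```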
* More training and evaluation tutorials:
  * [Data preparation](./docs/zh_CN/data_prepare.md)
  * [Training/evaluation/inference tutorial](./docs/zh_CN/get_started.md)
## Classic model implementations

* [Pixel2Pixel](./docs/zh_CN/tutorials/pix2pix_cyclegan.md)
* [CycleGAN](./docs/zh_CN/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/zh_CN/tutorials/psgan.md)
* [First Order Motion Model](./docs/zh_CN/tutorials/motion_driving.md)

## Composite applications

* [Video restoration](./docs/zh_CN/tutorials/video_restore.md)
## Online tutorials

You can try out some of PaddleGAN's capabilities online through the sample projects on [AI Studio](https://aistudio.baidu.com/aistudio/index):

|Online tutorial | Link |
|--------------|-----------|
|Restoring old Beijing street footage|[Run it](https://aistudio.baidu.com/aistudio/projectdetail/1161285)|
|Face expression and motion transfer: Su Daqiang sings "unravel"|[Run it](https://aistudio.baidu.com/aistudio/projectdetail/1048840)|
## Showcase

### Image translation

<div align='center'>
<img src='./docs/imgs/horse2zebra.gif' width='700' height='200'/>
</div>

### Old video restoration

<div align='center'>
<img src='./docs/imgs/color_sr_peking.gif' width='700'/>
</div>

### Motion transfer

<div align='center'>
<img src='./docs/imgs/first_order.gif' width='700'/>
</div>

### Super resolution

<div align='center'>
<img src='./docs/imgs/sr_demo.png' width='700' height='250'/>
</div>

### Makeup transfer

<div align='center'>
<img src='./docs/imgs/makeup_shifter.png' width='700' height='250'/>
</div>
## Changelog

- v0.1.0 (2020.11.02)
  - First release: supports the Pixel2Pixel, CycleGAN, and PSGAN models, plus applications such as video frame interpolation, super resolution, old photo/video colorization, and video motion generation.
  - Modular design with simple, easy-to-use interfaces.
## PaddleGAN Special Interest Group (SIG)

The SIG concept was first proposed and used by the [ACM (Association for Computing Machinery)](https://en.wikipedia.org/wiki/Association_for_Computing_Machinery) in 1961. Top international open source organizations, including [Kubernetes](https://kubernetes.io/), have adopted SIGs so that members who share a specific interest can share knowledge, learn, and develop projects together. Members do not need to be in the same country/region or the same organization; as long as they are like-minded, they can study, work, and play together toward the same goals.

PaddleGAN SIG is such a developer organization for people interested in GANs. It brings together frontline PaddlePaddle developers, senior engineers from Fortune Global 500 companies, and students from top universities at home and abroad.

We are continuously recruiting interested and capable developers to join us in building this project and exploring more useful and interesting applications together.

PaddleGAN QQ group: 1058398620

<div align='center'>
<img src='./docs/imgs/qq.png' width='250' height='300'/>
</div>
## Contributing

Contributions and suggestions are highly welcomed. Most contributions require you to agree to a Contributor License Agreement (CLA). When you submit a pull request, the CLA bot automatically checks whether you need to provide a CLA and follows up with instructions; you only need to agree to the CLA once for it to apply to all repositories. For more details, see the [contribution guide](docs/zh_CN/contribute.md).

## License

This project is released under the [Apache 2.0 license](LICENSE).
<div align='center'>
<img src='./docs/imgs/ppgan.jpg'>
</div>
English | [简体中文](./README.md)

# PaddleGAN

PaddleGAN is a development kit of Generative Adversarial Networks based on PaddlePaddle. It provides developers with high-performance implementations of classic and SOTA GANs, and supports quickly building, training, and deploying GANs for academic, entertainment, and industrial usage.

GAN (Generative Adversarial Network) was praised by Yann LeCun, "the father of convolutional networks", as "one of the most interesting ideas in the field of computer science in the past decade". It is one of the research areas in deep learning that AI researchers care about most.
[![License](https://img.shields.io/badge/license-Apache%202-red.svg)](LICENSE)![python version](https://img.shields.io/badge/python-3.6+-orange.svg)

## Quick Start
* Please refer to the [installation guide](./docs/en_US/install.md) to make sure you have successfully installed PaddlePaddle and PaddleGAN.
* Get started through the ppgan.apps interface:
```python
from ppgan.apps import RealSRPredictor
sr = RealSRPredictor()
sr.run("docs/imgs/monarch.png")
```
* More tutorials:
- [Data preparation](./docs/en_US/data_prepare.md)
- [Training/Evaluating/Testing basic usage](./docs/zh_CN/get_started.md)
## Model Tutorial
* [Pixel2Pixel](./docs/en_US/tutorials/pix2pix_cyclegan.md)
* [CycleGAN](./docs/en_US/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/en_US/tutorials/psgan.md)
* [First Order Motion Model](./docs/en_US/tutorials/motion_driving.md)
## Composite Application
* [Video restore](./docs/zh_CN/tutorials/video_restore.md)
## Examples

### Image Translation

<div align='center'>
<img src='./docs/imgs/horse2zebra.gif' width='700' height='200'/>
</div>

### Old video restoration

<div align='center'>
<img src='./docs/imgs/color_sr_peking.gif' width='700'/>
</div>

### Motion driving

<div align='center'>
<img src='./docs/imgs/first_order.gif' width='700'/>
</div>

### Super resolution

<div align='center'>
<img src='./docs/imgs/sr_demo.png' width='700' height='250'/>
</div>

### Makeup shifter

<div align='center'>
<img src='./docs/imgs/makeup_shifter.png' width='700' height='250'/>
</div>
## Changelog

- v0.1.0 (2020.11.02)
  - Release of the first version: supported models include Pixel2Pixel, CycleGAN, and PSGAN; supported applications include video frame interpolation, super resolution, image and video colorization, and image animation.
  - Modular design and friendly interface.

## PaddleGAN Special Interest Group (SIG)

The SIG concept was first proposed and used by the [ACM (Association for Computing Machinery)](https://en.wikipedia.org/wiki/Association_for_Computing_Machinery) in 1961. Top international open source organizations, including [Kubernetes](https://kubernetes.io/), adopt the form of SIGs so that members with the same specific interests can share knowledge, learn, and develop projects together. These members do not need to be in the same country/region or the same organization; as long as they are like-minded, they can study, work, and play together with the same goals.

PaddleGAN SIG is such a developer organization that brings together people interested in GANs. There are frontline developers of PaddlePaddle, senior engineers from Fortune Global 500 companies, and students from top universities at home and abroad.

We are continuing to recruit developers interested in and capable of joining us to build this project and explore more useful and interesting applications together.

PaddleGAN QQ group: 1058398620

<div align='center'>
<img src='./docs/imgs/qq.png' width='250' height='300'/>
</div>
## Contributing
......
@@ -67,6 +67,10 @@ parser.add_argument('--mindim',
                    default=360,
                    help='Length of minimum image edges')
# DeOldify args
parser.add_argument('--artistic',
                    action='store_true',
                    default=False,
                    help='whether to use artistic DeOldify Model')
parser.add_argument('--render_factor',
                    type=int,
                    default=32,
@@ -107,6 +111,7 @@ if __name__ == "__main__":
    elif order == 'DeOldify':
        predictor = DeOldifyPredictor(args.output,
                                      weight_path=args.DeOldify_weight,
                                      artistic=args.artistic,
                                      render_factor=args.render_factor)
        frames_path, temp_video_path = predictor.run(temp_video_path)
    elif order == 'RealSR':
......
@@ -13,9 +13,10 @@
Users can upload their prepared source image and driving video, then substitute their paths for the `source_image` and `driving_video` parameters in the following command. It will generate a video file named `result.mp4` in the `output` folder, which is the animated video file.

```
cd applications/
python -u tools/first-order-demo.py \
       --driving_video ../docs/imgs/fom_dv.mp4 \
       --source_image ../docs/imgs/fom_source_image.png \
       --relative --adapt_scale
```
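For programmatic use, ppgan.apps also exposes a first-order-motion predictor; a rough sketch of the equivalent call, assuming the `FirstOrderPredictor` name and keyword arguments (unverified against this exact revision):

```python
from ppgan.apps import FirstOrderPredictor

# Mirrors the demo flags above; the keyword names here are assumptions.
animator = FirstOrderPredictor(output='output', relative=True, adapt_scale=True)
animator.run('docs/imgs/fom_source_image.png', 'docs/imgs/fom_dv.mp4')
```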
......
@@ -37,9 +37,12 @@

## 1.3 Results

![](../../imgs/horse2zebra.png)

## 1.4 Model download

| Model | Dataset | Download |
|---|---|---|
| Pix2Pix_cityscapes | cityscapes | [Pix2Pix_cityscapes](https://paddlegan.bj.bcebos.com/models/Pix2Pix_cityscapes.pdparams) |
@@ -49,7 +52,7 @@

CycleGAN performs image translation with unpaired images: given two sets of images in different styles, it learns the style transfer automatically. CycleGAN consists of two generators and two discriminators: generator A takes images of style A and outputs images of style B, while generator B takes images of style B and outputs images of style A. The biggest difference between CycleGAN and Pix2Pix is that CycleGAN can perform image translation without a one-to-one mapping between the source domain and the target domain.

![](../../imgs/cyclegan.png)

## 2.2 How to use
@@ -87,9 +90,13 @@

## 2.3 Results

![](../../imgs/A2B.png)

## 2.4 Model download

| Model | Dataset | Download |
|---|---|---|
| CycleGAN_cityscapes | cityscapes | [CycleGAN_cityscapes](https://paddlegan.bj.bcebos.com/models/CycleGAN_cityscapes.pdparams) |
| CycleGAN_horse2zebra | horse2zebra | [CycleGAN_horse2zebra](https://paddlegan.bj.bcebos.com/models/CycleGAN_horse2zebra.pdparams) |

# References
......
@@ -10,15 +10,17 @@

## 2. How to use

### 2.1 Test

The pretrained model can be downloaded here: [psgan_weight](https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams)

Run the following command to complete the makeup transfer task. It will generate the transferred image in the current path when the program runs successfully.
```
python tools/psgan_infer.py \
  --config-file configs/makeup.yaml \
  --model_path /your/model/path \
  --source_path docs/imgs/ps_source.png \
  --reference_dir docs/imgs/ref/ps_ref \
  --evaluate-only True
```
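The command wraps the `PSGANPredictor` app added in this commit (see `tools/psgan_infer.py` below); the same flow, programmatically:

```python
from ppgan.utils.options import parse_args
from ppgan.utils.config import get_config
from ppgan.apps.psgan_predictor import PSGANPredictor

# CLI flags select the config, weights, source image, and reference dir.
args = parse_args()
cfg = get_config(args.config_file)
PSGANPredictor(args, cfg).run()
```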
**params:**
- config-file: PSGAN network configuration file, yaml format
@@ -33,14 +35,14 @@

```
mv landmarks/makeup MT-Dataset/landmarks/makeup
mv landmarks/non-makeup MT-Dataset/landmarks/non-makeup
cp landmarks/train_makeup.txt MT-Dataset/train_makeup.txt
cp landmarks/train_non-makeup.txt MT-Dataset/train_non-makeup.txt
```
The final data directory should look like:

```
data/MT-Dataset
├── images
│   ├── makeup
│   └── non-makeup
```
@@ -77,7 +79,7 @@

Model|Dataset|BatchSize|Inference speed|Download
---|:--:|:--:|:--:|:--:
PSGAN|MT-Dataset| 1 | 1.9s/image (GPU: P40) | [model](https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams)

## 3. Result

![](../../imgs/makeup_shifter.png)
......
../../zh_CN/tutorials/video_restore.md
\ No newline at end of file
Binary image diffs (too large to display inline; view as blob):
- docs/imgs/makeup_shifter.png: 507.4 KB → 1.5 MB
- docs/imgs/ppgan.jpg: 474.0 KB → 542.7 KB
- One image file was added.
@@ -39,6 +39,7 @@ ppgan.apps.DeOldifyPredictor(output='output', weight_path=None, render_factor=32)
>
> > - output (str): path where output images are saved; the default is output. Note that results are written to output/DeOldify.
> > - weight_path (str): model weight path; the default is None, in which case the built-in pretrained weights are downloaded automatically.
> > - artistic (bool): whether to use the more "artistic" model, which may produce some interesting colors but tends to show more artifacts.
> > - render_factor (int): scaling factor applied when rendering the colorization; the image is resized to a square of side 16 x render_factor before colorizing. For example, with the default render_factor=32 the input image is first resized to 512x512 (16x32=512). In general, a smaller render_factor is faster and gives more vivid colors; older and lower-quality images usually benefit from lowering it. A higher render_factor yields better image quality, but the colors may look slightly washed out.
### run
......
@@ -17,9 +17,10 @@
Users can upload their own image and video and substitute their paths for the source_image and driving_video parameters in the command below; running it completes the motion and expression transfer. When the program finishes successfully, a video file named result.mp4 is generated in the output folder; this is the motion-transferred video. The project provides a source image and a driving video for demonstration. The command is as follows:

```
cd applications/
python -u tools/first-order-demo.py \
       --driving_video ../docs/imgs/fom_dv.mp4 \
       --source_image ../docs/imgs/fom_source_image.png \
       --relative --adapt_scale
```
......
@@ -40,7 +40,10 @@

![](../../imgs/horse2zebra.png)

## 1.4 Model download

| Model | Dataset | Download |
|---|---|---|
| Pix2Pix_cityscapes | cityscapes | [Pix2Pix_cityscapes](https://paddlegan.bj.bcebos.com/models/Pix2Pix_cityscapes.pdparams) |
# 2 CycleGAN
@@ -88,7 +91,11 @@

![](../../imgs/A2B.png)

## 2.4 Model download

| Model | Dataset | Download |
|---|---|---|
| CycleGAN_cityscapes | cityscapes | [CycleGAN_cityscapes](https://paddlegan.bj.bcebos.com/models/CycleGAN_cityscapes.pdparams) |
| CycleGAN_horse2zebra | horse2zebra | [CycleGAN_horse2zebra](https://paddlegan.bj.bcebos.com/models/CycleGAN_horse2zebra.pdparams) |

# References
......
@@ -2,7 +2,7 @@

## 1. How PSGAN works

The task of the [PSGAN](https://arxiv.org/abs/1909.06956) model is makeup transfer: transferring the makeup of an arbitrary reference image onto a makeup-free source image. Many portrait beautification applications need this technique. Most recent makeup transfer methods are based on generative adversarial networks (GANs). They typically adopt the CycleGAN framework and train on two datasets, one of makeup-free images and one of images with makeup. However, existing methods share a limitation: they only perform well on frontal face images and have no module designed to handle the pose and expression differences between the source and reference images. PSGAN is a novel pose-robust, spatially-aware GAN. It consists of three parts: a Makeup Distill Network (MDNet), an Attentive Makeup Morphing (AMM) module, and a De-makeup & Re-makeup Network (DRNet). These three newly proposed modules give PSGAN the capabilities that an ideal makeup transfer model should have.

<div align="center">
  <img src="../../imgs/psgan_arc.png" width="800"/>
@@ -10,15 +10,17 @@

## 2. How to use

### 2.1 Test

The pretrained model can be downloaded here: [psgan_weight](https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams)

Run the following command to complete the makeup transfer. When the program finishes successfully, the transferred image is written to the current folder. The project provides a source image and references for demonstration. The command is as follows:
```
python tools/psgan_infer.py \
  --config-file configs/makeup.yaml \
  --model_path /your/model/path \
  --source_path docs/imgs/ps_source.png \
  --reference_dir docs/imgs/ref/ps_ref \
  --evaluate-only True
```
**Parameters:**
- config-file: PSGAN network configuration file, in yaml format
@@ -33,13 +35,13 @@

```
mv landmarks/makeup MT-Dataset/landmarks/makeup
mv landmarks/non-makeup MT-Dataset/landmarks/non-makeup
cp landmarks/train_makeup.txt MT-Dataset/train_makeup.txt
cp landmarks/train_non-makeup.txt MT-Dataset/train_non-makeup.txt
```
The final dataset directory looks like:

```
data/MT-Dataset
├── images
│   ├── makeup
│   └── non-makeup
```
@@ -73,7 +75,7 @@

### 2.3 Model

Model|Dataset|BatchSize|Inference speed|Download
---|:--:|:--:|:--:|:--:
PSGAN|MT-Dataset| 1 | 1.9s (GPU: P40) | [model](https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams)

## 3. Makeup transfer results
......
@@ -63,6 +63,7 @@ ppgan.apps.DeOldifyPredictor(output='output', weight_path=None, render_factor=32)

- `output (str, optional)`: output folder path. Default: `output`.
- `weight_path (None, optional)`: path of the weights to load; if not set, the default weights are downloaded from the cloud. Default: `None`.
- `artistic (bool)`: whether to use the more "artistic" model, which may produce some interesting colors but tends to show more artifacts.
- `render_factor (int)`: multiplied by 16 and used as the resize target for input frames; if set to 32, an input frame is resized to (32 * 16, 32 * 16) before being fed to the network.
......
@@ -14,6 +14,7 @@
import os

import cv2
import numpy as np
from PIL import Image

import paddle
@@ -64,9 +65,16 @@ class BasePredictor(object):
    def is_image(self, input):
        try:
            if isinstance(input, (np.ndarray, Image.Image)):
                return True
            elif isinstance(input, str):
                if not os.path.isfile(input):
                    raise ValueError('input must be a file')
                img = Image.open(input)
                _ = img.size
                return True
            else:
                return False
        except:
            return False
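With this change `is_image` accepts in-memory images as well as file paths; a sketch of the resulting behavior (using RealSRPredictor only as a concrete BasePredictor subclass):

```python
import numpy as np
from PIL import Image
from ppgan.apps import RealSRPredictor

pred = RealSRPredictor()  # any BasePredictor subclass behaves the same
assert pred.is_image(np.zeros((8, 8, 3), dtype=np.uint8))  # ndarray
assert pred.is_image(Image.new('RGB', (8, 8)))             # PIL image
assert pred.is_image('docs/imgs/monarch.png')              # existing file
assert not pred.is_image(42)                               # anything else
```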
......
@@ -82,14 +82,9 @@ class DAINPredictor(BasePredictor):
        vidname = video_path.split('/')[-1].split('.')[0]

        frames = sorted(glob.glob(os.path.join(out_path, '*.png')))

        if self.remove_duplicates:
            frames = self.remove_duplicate_frames(out_path)

        img = imread(frames[0])
@@ -125,9 +120,11 @@
        if not os.path.exists(os.path.join(frame_path_combined, vidname)):
            os.makedirs(os.path.join(frame_path_combined, vidname))
        for i in range(frame_num - 1):
            first = frames[i]
            second = frames[i + 1]
            first_index = int(first.split('/')[-1].split('.')[-2])
            second_index = int(second.split('/')[-1].split('.')[-2])

            img_first = imread(first)
            img_second = imread(second)
@@ -173,22 +170,43 @@
                    padding_left:padding_left + int_width],
                    (1, 2, 0)) for item in y_
            ]

            if self.remove_duplicates:
                num_frames = times_interp * (second_index - first_index) - 1
                time_offsets = [
                    kk * timestep for kk in range(1, 1 + num_frames, 1)
                ]
                start = times_interp * first_index + 1
                for item, time_offset in zip(y_, time_offsets):
                    out_dir = os.path.join(frame_path_interpolated, vidname,
                                           "{:08d}.png".format(start))
                    imsave(out_dir, np.round(item).astype(np.uint8))
                    start = start + 1
            else:
                time_offsets = [
                    kk * timestep for kk in range(1, 1 + num_frames, 1)
                ]
                count = 1
                for item, time_offset in zip(y_, time_offsets):
                    out_dir = os.path.join(
                        frame_path_interpolated, vidname,
                        "{:0>6d}_{:0>4d}.png".format(i, count))
                    count = count + 1
                    imsave(out_dir, np.round(item).astype(np.uint8))

        input_dir = os.path.join(frame_path_input, vidname)
        interpolated_dir = os.path.join(frame_path_interpolated, vidname)
        combined_dir = os.path.join(frame_path_combined, vidname)

        if self.remove_duplicates:
            self.combine_frames_with_rm(input_dir, interpolated_dir,
                                        combined_dir, times_interp)
        else:
            num_frames = int(1.0 / timestep) - 1
            self.combine_frames(input_dir, interpolated_dir, combined_dir,
                                num_frames)

        frame_pattern_combined = os.path.join(frame_path_combined, vidname,
                                              '%08d.png')
@@ -223,6 +241,26 @@ class DAINPredictor(BasePredictor):
        except Exception as e:
            print(e)

    def combine_frames_with_rm(self, input, interpolated, combined,
                               times_interp):
        frames1 = sorted(glob.glob(os.path.join(input, '*.png')))
        frames2 = sorted(glob.glob(os.path.join(interpolated, '*.png')))
        num1 = len(frames1)
        num2 = len(frames2)
        for i in range(num1):
            src = frames1[i]
            index = int(src.split('/')[-1].split('.')[-2])
            dst = os.path.join(combined,
                               '{:08d}.png'.format(times_interp * index))
            shutil.copy2(src, dst)

        for i in range(num2):
            src = frames2[i]
            imgname = src.split('/')[-1]
            dst = os.path.join(combined, imgname)
            shutil.copy2(src, dst)

    def remove_duplicate_frames(self, paths):
        def dhash(image, hash_size=8):
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
@@ -241,14 +279,19 @@ class DAINPredictor(BasePredictor):
        for (h, hashed_paths) in hashes.items():
            if len(hashed_paths) > 1:
                first_index = int(hashed_paths[0].split('/')[-1].split('.')[-2])
                last_index = int(
                    hashed_paths[-1].split('/')[-1].split('.')[-2]) + 1
                gap = 2 * (last_index - first_index) - 1
                if gap > 9:
                    mid = len(hashed_paths) // 2
                    for p in hashed_paths[1:mid - 1]:
                        os.remove(p)
                    for p in hashed_paths[mid + 1:]:
                        os.remove(p)
                else:
                    for p in hashed_paths[1:]:
                        os.remove(p)

        frames = sorted(glob.glob(os.path.join(paths, '*.png')))
        return frames
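The duplicate detection above groups frames by a difference hash; here is a standalone sketch of the `dhash` idea it relies on (this helper mirrors the pattern, not the exact upstream code):

```python
import cv2

def dhash(image, hash_size=8):
    # Shrink to (hash_size+1) x hash_size, compare horizontally adjacent
    # pixels, and pack the resulting bits into one integer fingerprint.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (hash_size + 1, hash_size))
    diff = resized[:, 1:] > resized[:, :-1]
    return sum(2**i for i, v in enumerate(diff.flatten()) if v)

# Frames with equal fingerprints are treated as duplicates and thinned out.
```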
@@ -26,17 +26,27 @@ from ppgan.models.generators.deoldify import build_model
from .base_predictor import BasePredictor

DEOLDIFY_STABLE_WEIGHT_URL = 'https://paddlegan.bj.bcebos.com/applications/DeOldify_stable.pdparams'
DEOLDIFY_ART_WEIGHT_URL = 'https://paddlegan.bj.bcebos.com/applications/DeOldify_art.pdparams'


class DeOldifyPredictor(BasePredictor):
    def __init__(self,
                 output='output',
                 weight_path=None,
                 artistic=False,
                 render_factor=32):
        self.output = os.path.join(output, 'DeOldify')
        if not os.path.exists(self.output):
            os.makedirs(self.output)
        self.render_factor = render_factor
        self.model = build_model(
            model_type='artistic' if artistic else 'stable')
        if weight_path is None:
            if artistic:
                weight_path = get_path_from_url(DEOLDIFY_ART_WEIGHT_URL)
            else:
                weight_path = get_path_from_url(DEOLDIFY_STABLE_WEIGHT_URL)
        state_dict = paddle.load(weight_path)
        self.model.load_dict(state_dict)
@@ -134,7 +144,10 @@ class DeOldifyPredictor(BasePredictor):
        out_path = None
        if self.output:
            try:
                base_name = os.path.splitext(os.path.basename(input))[0]
            except:
                base_name = 'result'
            out_path = os.path.join(self.output, base_name + '.png')
            pred_img.save(out_path)
......
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -31,6 +31,7 @@ from ppgan.utils.filesystem import load
from ppgan.engine.trainer import Trainer
from ppgan.models.builder import build_model
from ppgan.utils.preprocess import *
from .base_predictor import BasePredictor
def toImage(net_output):
@@ -52,14 +53,17 @@ def mask2image(mask: np.array, format="HWC"):
    return canvas


PS_WEIGHT_URL = "https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams"


class PreProcess:
    def __init__(self, config, need_parser=True):
        self.img_size = 256
        self.transform = transform = T.Compose([
            T.Resize(size=256),
            T.ToTensor(),
        ])
        self.norm = T.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
        if need_parser:
            self.face_parser = futils.mask.FaceParser()
        self.up_ratio = 0.6 / 0.85
@@ -82,8 +86,6 @@
        mask = cv2.resize(mask.numpy(), (self.img_size, self.img_size),
                          interpolation=cv2.INTER_NEAREST)
        mask = mask.astype(np.uint8)
        mask_tensor = paddle.to_tensor(mask)

        lms = futils.dlib.landmarks(image, face) * self.img_size / image.width
@@ -97,7 +99,7 @@
        image = self.transform(np_image)
        return [
            self.norm(image).unsqueeze(0),
            np.float32(mask_aug),
            np.float32(P_np),
            np.float32(mask)
        ]
@@ -145,11 +147,12 @@ class Inference:
            if with_face:
                return None, None
            return

        for i in range(1, len(source_input) - 1):
            source_input[i] = paddle.to_tensor(
                np.expand_dims(source_input[i], 0))

        for i in range(1, len(reference_input) - 1):
            reference_input[i] = paddle.to_tensor(
                np.expand_dims(reference_input[i], 0))
@@ -163,10 +166,9 @@
            'consis_mask': consis_mask
        }
        state_dicts = load(self.model_path)
        for net_name, net in self.model.nets.items():
            net.set_state_dict(state_dicts[net_name])
        result, _ = self.model.test(input_data)
        min_, max_ = result.min(), result.max()
        result += -min_
        result = paddle.divide(result, max_ - min_ + 1e-5)
@@ -174,38 +176,42 @@
        if with_face:
            return img, crop_face

        return img

class PSGANPredictor(BasePredictor):
    def __init__(self, args, cfg, output_path='output'):
        self.args = args
        self.cfg = cfg
        self.weight_path = self.args.model_path
        if self.weight_path is None:
            cur_path = os.path.abspath(os.path.dirname(__file__))
            self.weight_path = get_path_from_url(PS_WEIGHT_URL, cur_path)
        self.output_path = output_path

    def run(self):
        setup(self.args, self.cfg)
        inference = Inference(self.cfg, self.weight_path)
        postprocess = PostProcess(self.cfg)

        source = Image.open(self.args.source_path).convert("RGB")
        reference_paths = list(Path(self.args.reference_dir).glob("*"))
        np.random.shuffle(reference_paths)
        for reference_path in reference_paths:
            if not reference_path.is_file():
                print(reference_path, "is not a valid file.")
                continue

            reference = Image.open(reference_path).convert("RGB")

            # Transfer the makeup from the reference to the source image.
            image, face = inference.transfer(source, reference, with_face=True)
            source_crop = source.crop(
                (face.left(), face.top(), face.right(), face.bottom()))
            image = postprocess(source_crop, image)
            ref_img_name = os.path.split(reference_path)[1]
            save_path = os.path.join(self.output_path,
                                     'transfered_ref_' + ref_img_name)
            image.save(save_path)
@@ -107,7 +107,10 @@ class RealSRPredictor(BasePredictor):
        out_path = None
        if self.output:
            try:
                base_name = os.path.splitext(os.path.basename(input))[0]
            except:
                base_name = 'result'
            out_path = os.path.join(self.output, base_name + '.png')
            pred_img.save(out_path)
......
@@ -256,7 +256,7 @@ class Trainer:
        assert name in ['checkpoint', 'weight']
        state_dicts = {}
        save_filename = 'epoch_%s_%s.pdparams' % (epoch, name)
        save_path = os.path.join(self.output_dir, save_filename)
        for net_name, net in self.model.nets.items():
            state_dicts[net_name] = net.state_dict()
@@ -275,7 +275,8 @@
        if keep > 0:
            try:
                checkpoint_name_to_be_removed = os.path.join(
                    self.output_dir,
                    'epoch_%s_%s.pdparams' % (epoch - keep, name))
                if os.path.exists(checkpoint_name_to_be_removed):
                    os.remove(checkpoint_name_to_be_removed)
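Checkpoints are plain dicts of per-network state dicts, now saved with the `.pdparams` suffix; a sketch of loading one back, mirroring the pattern used by the PSGAN inference code above (the path is illustrative):

```python
import paddle

# Weight files are dicts keyed by network name (e.g. 'netG').
state_dicts = paddle.load('output_dir/epoch_1_weight.pdparams')
for net_name, net in model.nets.items():  # model as built by PaddleGAN
    net.set_state_dict(state_dicts[net_name])
```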
......
@@ -23,7 +23,7 @@ from paddle.utils.download import get_path_from_url
import pickle

from .model import BiSeNet

BISENET_WEIGHT_URL = 'https://paddlegan.bj.bcebos.com/models/bisenet.pdparams'
class FaceParser:
@@ -65,7 +65,7 @@ class FaceParser:
        image = image.transpose((2, 0, 1))
        image = self.transforms(image)
        state_dict = paddle.load(self.save_pth)
        self.net.set_dict(state_dict)
        self.net.eval()
@@ -75,8 +75,6 @@
        out = self.net(image)[0]
        parsing = out.squeeze(0).argmax(0)  #argmax(0).astype('float32')

        parse_np = parsing.numpy()
        h, w = parse_np.shape
        result = np.zeros((h, w))
@@ -16,7 +16,7 @@ import numpy as np
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.vision.models import resnet34, resnet101

from .hook import hook_outputs, model_sizes, dummy_eval
from ...modules.nn import Spectralnorm
def __init__(self, def __init__(self,
encoder, encoder,
n_classes, n_classes,
model_type='stable',
blur=False, blur=False,
blur_final=True, blur_final=True,
self_attention=False, self_attention=False,
@@ -95,18 +96,34 @@
            do_blur = blur and (not_final or blur_final)
            sa = self_attention and (i == len(sfs_idxs) - 3)

            if model_type == 'stable':
                n_out = nf if not_final else nf // 2
                unet_block = UnetBlockWide(up_in_c,
                                           x_in_c,
                                           n_out,
                                           self.sfs[i],
                                           final_div=not_final,
                                           blur=blur,
                                           self_attention=sa,
                                           norm_type=norm_type,
                                           extra_bn=extra_bn,
                                           **kwargs)
            elif model_type == 'artistic':
                unet_block = UnetBlockDeep(up_in_c,
                                           x_in_c,
                                           self.sfs[i],
                                           final_div=not_final,
                                           blur=blur,
                                           self_attention=sa,
                                           norm_type=norm_type,
                                           extra_bn=extra_bn,
                                           nf_factor=nf_factor,
                                           **kwargs)
            else:
                raise ValueError(
                    'Expected model_type in [stable, artistic], but got {}'.
                    format(model_type))

            unet_block.eval()
            layers.append(unet_block)
            x = unet_block(x)
@@ -151,7 +168,7 @@ def custom_conv_layer(ni: int,
    bn = norm_type in ('Batch', 'Batchzero') or extra_bn == True
    if bias is None:
        bias = not bn
    conv_func = nn.Conv2DTranspose if transpose else nn.Conv1D if is_1d else nn.Conv2D
    conv = conv_func(ni,
                     nf,
class UnetBlockDeep(nn.Layer):
    "A quasi-UNet block, using `PixelShuffle_ICNR upsampling`."

    def __init__(self,
                 up_in_c: int,
                 x_in_c: int,
                 hook,
                 final_div: bool = True,
                 blur: bool = False,
                 leaky: float = None,
                 self_attention: bool = False,
                 nf_factor: float = 1.0,
                 **kwargs):
        super().__init__()
        self.hook = hook
        self.shuf = CustomPixelShuffle_ICNR(up_in_c,
                                            up_in_c // 2,
                                            blur=blur,
@@ -312,7 +328,7 @@ def conv_layer(ni: int,
    if padding is None: padding = (ks - 1) // 2 if not transpose else 0
    bn = norm_type in ('Batch', 'BatchZero')
    if bias is None: bias = not bn
    conv_func = nn.Conv2DTranspose if transpose else nn.Conv1D if is_1d else nn.Conv2D
    conv = conv_func(ni,
                     nf,
@@ -472,16 +488,27 @@ def _get_sfs_idxs(sizes):
    return sfs_idxs


def build_model(model_type='stable'):
    if model_type == 'stable':
        backbone = resnet101()
        nf_factor = 2
    elif model_type == 'artistic':
        backbone = resnet34()
        nf_factor = 1.5
    else:
        raise ValueError(
            'Expected model_type in [stable, artistic], but got {}'.format(
                model_type))

    cut = -2
    encoder = nn.Sequential(*list(backbone.children())[:cut])

    model = Deoldify(encoder,
                     3,
                     model_type=model_type,
                     blur=True,
                     y_range=(-3, 3),
                     norm_type='Spectral',
                     self_attention=True,
                     nf_factor=nf_factor)
    return model
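A short usage sketch of the updated factory (import path as used by the DeOldify predictor above):

```python
from ppgan.models.generators.deoldify import build_model

# 'stable'   -> resnet101 backbone, nf_factor=2   (UnetBlockWide decoder)
# 'artistic' -> resnet34 backbone,  nf_factor=1.5 (UnetBlockDeep decoder)
model = build_model(model_type='artistic')
```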
@@ -296,31 +296,65 @@ class MANet(paddle.nn.Layer):
        # x -> src img
        x = self.encoder(x)
        _, c, h, w = x.shape
        _, c2, h2, w2 = y.shape

        mask_x = F.interpolate(mask_x, size=(64, 64))
        mask_x = mask_x.transpose((1, 0, 2, 3))
        mask_x_re = mask_x.tile([1, x.shape[1], 1, 1])
        mask_x_diff_re = mask_x.tile([1, x_p.shape[1], 1, 1])
        mask_y = F.interpolate(mask_y, size=(64, 64))
        mask_y = mask_y.transpose((1, 0, 2, 3))
        mask_y_re = mask_y.tile([1, y.shape[1], 1, 1])
        mask_y_diff_re = mask_y.tile([1, y_p.shape[1], 1, 1])

        x_re = x.tile([3, 1, 1, 1])
        y_re = y.tile([3, 1, 1, 1])
        x_flat = x_re * mask_x_re
        y_flat = y_re * mask_y_re

        x_p = x_p.tile([3, 1, 1, 1]) * mask_x_diff_re
        y_p = y_p.tile([3, 1, 1, 1]) * mask_y_diff_re

        norm_x = paddle.norm(x_p, axis=1,
                             keepdim=True).tile([1, x_p.shape[1], 1, 1])
        norm_x = paddle.where(norm_x == 0, paddle.to_tensor(1e10), norm_x)
        x_p = x_p / norm_x
        norm_y = paddle.norm(y_p, axis=1,
                             keepdim=True).tile([1, y_p.shape[1], 1, 1])
        norm_y = paddle.where(norm_y == 0, paddle.to_tensor(1e10), norm_y)
        y_p = y_p / norm_y

        x_flat = paddle.concat([x_flat * 0.01, x_p], axis=1)
        y_flat = paddle.concat([y_flat * 0.01, y_p], axis=1)
        x_flat_re = x_flat.reshape([3, x_flat.shape[1], h * w])
        y_flat_re = y_flat.reshape([3, y_flat.shape[1], h2 * w2])

        a_ = paddle.matmul(x_flat_re, y_flat_re, transpose_x=True)
        with paddle.no_grad():
            a_mask = a_ != 0
        a_ *= 200
        a = F.softmax(a_, axis=-1)
        a = a * a_mask

        gamma, beta = self.simple_spade(y)
        gamma = gamma.tile([3, 1, 1, 1]) * mask_y
        beta = beta.tile([3, 1, 1, 1]) * mask_y

        beta = beta.reshape([-1, h2 * w2, 1])
        beta = paddle.matmul(a, beta)
        beta = beta.transpose((0, 2, 1))
        beta = beta.reshape([-1, 1, h2, w2])
        gamma = gamma.reshape([-1, h2 * w2, 1])
        gamma = paddle.matmul(a, gamma)
        gamma = gamma.transpose((0, 2, 1))
        gamma = gamma.reshape([-1, 1, h2, w2])

        beta = (beta[0] + beta[1] + beta[2]).unsqueeze(0)
        gamma = (gamma[0] + gamma[1] + gamma[2]).unsqueeze(0)

        x = x * (1 + gamma) + beta

        for i in range(self.repeat_num):
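The `a_mask` trick above records which attention logits were zeroed out by the masks, sharpens the remaining logits, and re-zeroes the masked entries after the softmax; a minimal standalone sketch of the same pattern:

```python
import paddle
import paddle.nn.functional as F

a_ = paddle.to_tensor([[0.0, 0.2, 0.5], [0.1, 0.0, 0.3]])
a_mask = (a_ != 0).astype('float32')  # remember masked-out positions
a = F.softmax(a_ * 200, axis=-1)      # sharpen, then normalize
a = a * a_mask                        # drop attention on masked entries
```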
......
@@ -323,9 +323,9 @@ class MakeupModel(BaseModel):
        g_B_eye_loss_his = self.criterionL1(fake_B_eye_masked, fake_match_eye_B)
        self.loss_G_A_his = (g_A_eye_loss_his + g_A_lip_loss_his +
                             g_A_skin_loss_his * 0.1) * 0.1
        self.loss_G_B_his = (g_B_eye_loss_his + g_B_lip_loss_his +
                             g_B_skin_loss_his * 0.1) * 0.1
        self.losses['G_A_his_loss'] = self.loss_G_A_his
        self.losses['G_B_his_loss'] = self.loss_G_B_his
@@ -343,9 +343,9 @@
        self.loss_B_vgg = self.criterionL2(vgg_fake_B,
                                           vgg_r) * lambda_B * lambda_vgg
        self.loss_rec = (self.loss_cycle_A * 0.2 + self.loss_cycle_B * 0.2 +
                         self.loss_A_vgg + self.loss_B_vgg) * 0.5
        self.loss_idt = (self.loss_idt_A + self.loss_idt_B) * 0.1
        self.losses['G_A_vgg_loss'] = self.loss_A_vgg
        self.losses['G_B_vgg_loss'] = self.loss_B_vgg
......
@@ -80,11 +80,8 @@ class Pix2PixModel(BaseModel):
        AtoB = self.cfg.dataset.train.direction == 'AtoB'
        self.real_A = paddle.to_tensor(input['A' if AtoB else 'B'])
        self.real_B = paddle.to_tensor(input['B' if AtoB else 'A'])

        self.image_paths = input['A_paths' if AtoB else 'B_paths']
......
@@ -57,7 +57,7 @@ def parse_args():
    parser.add_argument("--reference_dir",
                        default="",
                        help="path to reference images")
    parser.add_argument("--model_path", default=None, help="model for loading")

    args = parser.parse_args()
......
@@ -30,11 +30,9 @@ def generate_P_from_lmks(lmks, resize, w, h):
    diff = fix - lmks

    diff = diff.transpose(1, 2, 0)
    diff = cv2.resize(diff, diff_size, interpolation=cv2.INTER_NEAREST)
    diff = diff.transpose(2, 0, 1)

    return diff


def copy_area(tar, src, lms):
......
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
from ppgan.utils.options import parse_args
from ppgan.utils.config import get_config
from ppgan.apps.psgan_predictor import PSGANPredictor
if __name__ == '__main__':
    args = parse_args()
    cfg = get_config(args.config_file)
    predictor = PSGANPredictor(args, cfg)
    predictor.run()