Unverified · Commit 1b84c4fb authored by Q qingqing01 and committed by GitHub

Change docs to docs/en_US and docs/zh_CN (#63)

* Update README
* Change docs dir to en_US and zh_CN
* Add paper reference
* Fix link
Parent 271d846e
......@@ -17,6 +17,7 @@
- id: detect-private-key
- id: check-symlinks
- id: check-added-large-files
args: ['--maxkb=700']
- repo: local
hooks:
......
......@@ -32,20 +32,21 @@ PaddleGAN is a PaddlePaddle-based development toolkit for generative adversarial networks.
## Installation
Please refer to the [installation guide](./docs/install.md) to install PaddlePaddle and ppgan.
Please refer to the [installation guide](./docs/zh_CN/install.md) to install PaddlePaddle and ppgan.
## Data Preparation
Please refer to [data preparation](./docs/data_prepare.md) to prepare the corresponding datasets.
Please refer to [data preparation](./docs/zh_CN/data_prepare.md) to prepare the corresponding datasets.
## Quick Start
For training, evaluation and inference, please refer to [Get Started](./docs/get_started.md).
For training, evaluation and inference, please refer to [Get Started](./docs/zh_CN/get_started.md).
## Model tutorials
* [Pixel2Pixel and CycleGAN](./docs/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/tutorials/psgan.md)
* [Video restoration](./docs/tutorials/video_restore.md)
* [Motion driving](./docs/tutorials/motion_driving.md)
* [Pixel2Pixel](./docs/zh_CN/tutorials/pix2pix_cyclegan.md)
* [CycleGAN](./docs/zh_CN/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/zh_CN/tutorials/psgan.md)
* [First Order Motion Model](./docs/zh_CN/tutorials/motion_driving.md)
* [Video restoration](./docs/zh_CN/tutorials/video_restore.md)
## License
This project is released under the [Apache 2.0 license](LICENSE).
......@@ -53,11 +54,4 @@ PaddleGAN is a PaddlePaddle-based development toolkit for generative adversarial networks.
## Contributing
We highly welcome contributions and suggestions to PaddleGAN. Most contributions require you to agree to a Contributor License Agreement (CLA). When you submit a pull request, the CLA bot will automatically check whether you need to provide a CLA; simply follow its instructions. You only need to agree to the CLA once for it to apply to all of the repositories. For more details on the workflow, see the [contribution guide](docs/CONTRIBUTE.md).
## External Projects
External GAN models in the community based on PaddlePaddle:
+ [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN)
We highly welcome contributions and suggestions to PaddleGAN. Most contributions require agreeing to a Contributor License Agreement (CLA). When a pull request is submitted, the CLA bot will automatically check whether you need to provide a CLA; simply follow its instructions. You only need to agree to the CLA once for it to apply to all of the repositories. For more details on the workflow, see the [contribution guide](docs/zh_CN/contribute.md).
......@@ -35,19 +35,20 @@ changes.
## Install
Please refer to [install](./docs/install_en.md).
Please refer to [install](./docs/en_US/install.md).
## Data Prepare
Please refer to [data prepare](./docs/data_prepare_en.md) for dataset preparation.
Please refer to [data prepare](./docs/en_US/data_prepare.md) for dataset preparation.
## Get Started
Please refer [get started](./docs/get_started_en.md) for the basic usage of PaddleGAN.
Please refer to [get started](./docs/en_US/get_started.md) for the basic usage of PaddleGAN.
## Model tutorial
* [Pixel2Pixel and CycleGAN](./docs/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/tutorials/psgan_en.md)
* [Video restore](./docs/tutorails/video_restore.md)
* [Motion driving](./docs/tutorials/motion_driving_en.md)
* [Pixel2Pixel](./docs/en_US/tutorials/pix2pix_cyclegan.md)
* [CycleGAN](./docs/en_US/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/en_US/tutorials/psgan.md)
* [First Order Motion Model](./docs/en_US/tutorials/motion_driving.md)
* [Video restore](./docs/zh_CN/tutorials/video_restore.md)
## License
PaddleGAN is released under the [Apache 2.0 license](LICENSE).
......@@ -56,11 +57,4 @@ PaddleGAN is released under the [Apache 2.0 license](LICENSE).
Contributions and suggestions are highly welcomed. Most contributions require you to agree to a [Contributor License Agreement (CLA)](https://cla-assistant.io/PaddlePaddle/PaddleGAN).
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA. Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
For more, please reference [contribution guidelines](docs/CONTRIBUTE.md).
## External Projects
External gan projects in the community that base on PaddlePaddle:
+ [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN)
For more, please reference [contribution guidelines](docs/en_US/contribute.md).
../../zh_CN/apis/apps.md
\ No newline at end of file
......@@ -88,5 +88,3 @@ facades
├── train
└── val
```
![](./imgs/1.jpg)
# First order motion model
## 1. First order motion model introduction
## First order motion model introduction
First order motion model is to complete the Image animation task, which consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video. The first order motion framework addresses this problem without using any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), this method can be applied to any object of this class. To achieve this, the innovative method decouple appearance and motion information using a self-supervised formulation. In addition, to support complex motions, it use a representation consisting of a set of learned keypoints along with their local affine transformations. A generator network models occlusions arising during target motions and combines the appearance extracted from the source image and the motion derived from the driving video.
![](../imgs/fom_demo.png)
[First order motion model](https://arxiv.org/abs/2003.00196) addresses the image animation task: generating a video sequence in which an object from a source image is animated according to the motion of a driving video. The first order motion framework does this without any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), the method can be applied to any object of that class. To achieve this, it decouples appearance and motion information using a self-supervised formulation. To support complex motions, it uses a representation consisting of a set of learned keypoints along with their local affine transformations. A generator network models the occlusions that arise during target motions and combines the appearance extracted from the source image with the motion derived from the driving video.
<div align="center">
<img src="../../imgs/fom_demo.png" width="500"/>
</div>
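Where the "first order" in the name comes from, in one condensed formula (a sketch of the formulation in the paper linked above, using its notation; see the paper for the full derivation): around each learned keypoint p_k, the warp from the driving frame D to the source frame S is approximated by a first-order Taylor expansion through an abstract reference frame R:
```
\mathcal{T}_{S \leftarrow D}(z) \approx \mathcal{T}_{S \leftarrow R}(p_k) + J_k \big(z - \mathcal{T}_{D \leftarrow R}(p_k)\big),
\qquad J_k = \Big(\tfrac{d}{dp}\mathcal{T}_{S \leftarrow R}(p)\big|_{p=p_k}\Big)\Big(\tfrac{d}{dp}\mathcal{T}_{D \leftarrow R}(p)\big|_{p=p_k}\Big)^{-1}
```
Each keypoint therefore contributes a local affine transformation (an offset plus the Jacobian J_k); a dense motion network combines these local approximations with an occlusion map before the generator renders the output frame.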
## How to use
Upload your prepared source image and driving video, substitute their paths for the `source_image` and `driving_video` parameters in the command below, and run it. A video file named `result.mp4` will be generated in the `output` folder; this is the animated video.
`python -u tools/first-order-demo.py --driving_video ./ravel_10.mp4 --source_image ./sudaqiang.png --relative --adapt_scale`
```
python -u tools/first-order-demo.py --driving_video ./ravel_10.mp4 --source_image ./sudaqiang.png --relative --adapt_scale
```
**params:**
- driving_video: the driving video whose motion will be transferred to the source image.
......@@ -17,5 +22,17 @@ Users can upload the prepared source image and driving video, then substitute th
- relative: whether to use relative or absolute coordinates of the keypoints in the video. Relative coordinates are recommended; with absolute coordinates the subject may be distorted after animation.
- adapt_scale: adapt movement scale based on convex hull of keypoints.
## 3. Animation results
![](../imgs/first_order.gif)
## Animation results
![](../../imgs/first_order.gif)
## Reference
```
@InProceedings{Siarohin_2019_NeurIPS,
  author = {Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu},
  title = {First Order Motion Model for Image Animation},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  month = {December},
  year = {2019}
}
```
......@@ -3,7 +3,7 @@
## 1.1 Principle
Pix2pix performs image-to-image translation with paired images: the input is the same image in two different styles, so it can be used for style transfer. Pix2pix builds on cGAN. In cGAN the generator receives not only a noise image but also a condition as supervision; pix2pix instead feeds the image in the other style into the generator as the supervision signal, so the generated fake image is tied to that other-style image, which realizes image translation.
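In terms of losses (a condensed sketch of the objective from the pix2pix paper cited in the references below, where x is the input-style image, y the paired target-style image and z the noise), the generator is trained with the conditional GAN loss plus an L1 term that keeps the output close to the paired target:
```
\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y}[\log D(x,y)] + \mathbb{E}_{x,z}[\log(1 - D(x,G(x,z)))]
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\,\lVert y - G(x,z)\rVert_1\,\big]
G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G,D) + \lambda\,\mathcal{L}_{L1}(G)
```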
## 1.2 How to use
### 1.2.1 Prepare Datasets
......@@ -94,5 +94,24 @@
# References
1. [Image-to-Image Translation with Conditional Adversarial Networks](https://arxiv.org/abs/1611.07004)
2. [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
- 1. [Image-to-Image Translation with Conditional Adversarial Networks](https://arxiv.org/abs/1611.07004)
```
@inproceedings{isola2017image,
title={Image-to-Image Translation with Conditional Adversarial Networks},
author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
booktitle={Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on},
year={2017}
}
```
- 2. [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
```
@inproceedings{CycleGAN2017,
title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
year={2017}
}
```
# PSGAN
## 1. PSGAN introduction
This paper is to address the makeup transfer task, which aims to transfer the makeup from a reference image to a source image. Existing methods have achieved promising progress in constrained scenarios, but transferring between images with large pose and expression differences is still challenging. To address these issues, we propose Pose and expression robust Spatial-aware GAN (PSGAN). It first utilizes Makeup Distill Network to disentangle the makeup of the reference image as two spatial-aware makeup matrices. Then, Attentive Makeup Morphing module is introduced to specify how the makeup of a pixel in the source image is morphed from the reference image. With the makeup matrices and the source image, Makeup Apply Network is used to perform makeup transfer.
![](../imgs/psgan_arc.png)
[PSGAN](https://arxiv.org/abs/1909.06956) addresses the makeup transfer task, which aims to transfer the makeup from a reference image to a source image. Existing methods have achieved promising progress in constrained scenarios, but transferring between images with large pose and expression differences remains challenging. To address these issues, the paper proposes the Pose and expression robust Spatial-aware GAN (PSGAN). It first utilizes a Makeup Distill Network to disentangle the makeup of the reference image into two spatial-aware makeup matrices. An Attentive Makeup Morphing module is then introduced to specify how the makeup of a pixel in the source image is morphed from the reference image. Finally, with the makeup matrices and the source image, a Makeup Apply Network performs the makeup transfer.
<div align="center">
<img src="../../imgs/psgan_arc.png" width="800"/>
</div>
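In slightly more concrete terms (a condensed sketch of the formulation in the PSGAN paper linked above, not an exact reproduction of its notation): the Makeup Distill Network distills the reference makeup into two spatial makeup matrices γ and β, the Attentive Makeup Morphing module computes an attention map A between source and reference pixels and uses it to morph them into versions aligned with the source, and the Makeup Apply Network modulates the source feature map V element-wise:
```
\gamma' = A\,\gamma, \qquad \beta' = A\,\beta
V' = \gamma' \odot V + \beta'
```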
## 2. How to use
### 2.1 Test
......@@ -69,8 +74,23 @@ The training log looks like:
Note: in the training phase the `isTrain` value in the makeup.yaml file is `True`; in the test phase it should be changed to `False`.
### 2.3 Model
Model|Dataset|BatchSize|Inference speed|Download
---|:--:|:--:|:--:|:--:
PSGAN|MT-Dataset| 1 | 1.9s(GPU:P40) | [model]()
PSGAN|MT-Dataset| 1 | 1.9s/image (GPU:P40) | [model]()
## 3. Result
![](../imgs/makeup_shifter.png)
![](../../imgs/makeup_shifter.png)
## 4. References
```
@InProceedings{Jiang_2020_CVPR,
author = {Jiang, Wentao and Liu, Si and Gao, Chen and Cao, Jie and He, Ran and Feng, Jiashi and Yan, Shuicheng},
title = {PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
```
../en_US/contribute.md
\ No newline at end of file
......@@ -88,5 +88,3 @@ facades
├── train
└── val
```
![](./imgs/1.jpg)
# First order motion model
## 1. How the first order motion model works
## How the first order motion model works
The task of the first order motion model is image animation: given a source image and a driving video, generate a video in which the subject comes from the source image and the motion comes from the driving video. As shown below, the source image usually contains a subject and the driving video contains a sequence of motions.
![](../imgs/fom_demo.png)
<div align="center">
<img src="../../imgs/fom_demo.png" width="500"/>
</div>
Take the facial expression transfer in the top-left corner as an example: given a source person and a driving video, a video can be generated in which the subject is the source person and that person's expressions are determined by the expressions in the driving video. Normally this would require annotating facial keypoints of the source person and training a dedicated expression transfer model.
The method proposed in this paper, however, only needs to be trained on a dataset of objects of the same category. For example, to transfer Tai Chi movements, train on a Tai Chi video dataset; to transfer facial expressions, train on the VoxCeleb face video dataset. Once trained, the corresponding pretrained model enables the real-time image animation described above.
## 2. How to use
## How to use
Users can upload their own image and video, substitute their paths for the `source_image` and `driving_video` parameters in the command below, and run it to perform the motion and expression transfer. On success, a video file named `result.mp4` is generated in the `output` folder; this is the motion-transferred video. A sample source image and driving video are provided in this project for demonstration. The command is as follows:
`python -u tools/first-order-demo.py --driving_video ./ravel_10.mp4 --source_image ./sudaqiang.png --relative --adapt_scale`
```
python -u tools/first-order-demo.py --driving_video ./ravel_10.mp4 --source_image ./sudaqiang.png --relative --adapt_scale
```
**Parameters:**
- driving_video: the driving video; the expressions and motions of the person in this video will be transferred.
......@@ -23,5 +27,17 @@ The task of the first order motion model is image animation: given a source image,
- adapt_scale: adapt the movement scale based on the convex hull of the keypoints.
## 3. Results
![](../imgs/first_order.gif)
## Results
![](../../imgs/first_order.gif)
## References
```
@InProceedings{Siarohin_2019_NeurIPS,
  author = {Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu},
  title = {First Order Motion Model for Image Animation},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  month = {December},
  year = {2019}
}
```
......@@ -3,7 +3,7 @@
## 1.1 How it works
Pix2pix performs image-to-image translation with paired images: the input is the same image in two different styles, so it can be used for style transfer. Pix2pix builds on cGAN. In cGAN the generator receives not only a noise image but also a condition as supervision; pix2pix instead feeds the image in the other style into the generator as the supervision signal, so the generated fake image is tied to that other-style image, which realizes image translation.
![](../imgs/pix2pix.png)
![](../../imgs/pix2pix.png)
## 1.2 How to use
......@@ -38,7 +38,7 @@
## 1.3 Results
![](../imgs/horse2zebra.png)
![](../../imgs/horse2zebra.png)
[Download model](TODO)
......@@ -86,12 +86,32 @@
## 2.3 Results
![](../imgs/A2B.png)
![](../../imgs/A2B.png)
[Download model](TODO)
# References
1. [Image-to-Image Translation with Conditional Adversarial Networks](https://arxiv.org/abs/1611.07004)
2. [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
[Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
- 1. [Image-to-Image Translation with Conditional Adversarial Networks](https://arxiv.org/abs/1611.07004)
```
@inproceedings{isola2017image,
title={Image-to-Image Translation with Conditional Adversarial Networks},
author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
booktitle={Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on},
year={2017}
}
```
- 2. [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
```
@inproceedings{CycleGAN2017,
title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
year={2017}
}
```
# PSGAN
## 1. How PSGAN works
The task of PSGAN is makeup transfer: transferring the makeup from an arbitrary reference image onto a source image without makeup. Many portrait beautification applications need this technique. Most recent makeup transfer methods are based on generative adversarial networks (GANs). They typically adopt the CycleGAN framework and train on two datasets, one of non-makeup images and one of makeup images. However, existing methods share a limitation: they only perform well on frontal face images and have no module designed to handle the pose and expression differences between the source and reference images. PSGAN is a novel pose-robust, spatial-aware GAN. It consists of three parts: a Makeup Distill Network (MDNet), an Attentive Makeup Morphing (AMM) module, and a De-makeup Re-makeup Network (DRNet). These three newly proposed modules give PSGAN the capabilities, described above, that an ideal makeup transfer model should have.
![](../imgs/psgan_arc.png)
The task of [PSGAN](https://arxiv.org/abs/1909.06956) is makeup transfer: transferring the makeup from an arbitrary reference image onto a source image without makeup. Many portrait beautification applications need this technique. Most recent makeup transfer methods are based on generative adversarial networks (GANs). They typically adopt the CycleGAN framework and train on two datasets, one of non-makeup images and one of makeup images. However, existing methods share a limitation: they only perform well on frontal face images and have no module designed to handle the pose and expression differences between the source and reference images. PSGAN is a novel pose-robust, spatial-aware GAN. It consists of three parts: a Makeup Distill Network (MDNet), an Attentive Makeup Morphing (AMM) module, and a De-makeup Re-makeup Network (DRNet). These three newly proposed modules give PSGAN the capabilities, described above, that an ideal makeup transfer model should have.
<div align="center">
<img src="../../imgs/psgan_arc.png" width="800"/>
</div>
## 2. How to use
### 2.1 Test
......@@ -72,4 +76,18 @@ Model|Dataset|BatchSize|Inference speed|Download
PSGAN|MT-Dataset| 1 | 1.9s(GPU:P40) | [model]()
## 3. Makeup transfer results
![](../imgs/makeup_shifter.png)
![](../../imgs/makeup_shifter.png)
## 4. References
```
@InProceedings{Jiang_2020_CVPR,
author = {Jiang, Wentao and Liu, Si and Gao, Chen and Cao, Jie and He, Ran and Feng, Jiashi and Yan, Shuicheng},
title = {PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
```
......@@ -14,7 +14,7 @@ python tools/video-enhance.py --input you_video_path.mp4 --proccess_order DAIN D
- `--proccess_order`: the names and order of the models to call. For example, `DAIN DeOldify EDVR` calls `DAINPredictor`, `DeOldifyPredictor` and `EDVRPredictor` in that order (a Python sketch of this pipeline is given below).
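If you prefer to drive the same pipeline from Python, the predictor classes documented in the sections below can be chained by hand. A minimal sketch, assuming each predictor exposes a `run()` method that takes a video path and returns the path of the processed video (check `ppgan.apps` in your version for the exact signatures; the input path is a placeholder):
```
# Rough sketch of the restoration pipeline, equivalent in spirit to
#   python tools/video-enhance.py --input you_video_path.mp4 --proccess_order DAIN DeOldify EDVR
# Assumption: each predictor's run() takes a video path and returns the processed video path.
from ppgan.apps import DAINPredictor, DeOldifyPredictor, EDVRPredictor

video = 'you_video_path.mp4'  # placeholder input

steps = [
    DAINPredictor(),                                       # frame interpolation (constructor args omitted, see below)
    DeOldifyPredictor(output='output', render_factor=32),  # colorization
    EDVRPredictor(output='output'),                        # super resolution
]
for predictor in steps:
    video = predictor.run(video)  # each step writes its result under ./output

print('final video:', video)
```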
#### Results
![](../imgs/color_sr_peking.gif)
![](../../imgs/color_sr_peking.gif)
### Quick start
......@@ -35,7 +35,7 @@ python tools/video-enhance.py --input you_video_path.mp4 --proccess_order DAIN D
### Frame interpolation model: DAIN
The DAIN model explicitly detects occlusion by exploring depth information, and uses a depth-aware flow projection layer to synthesize intermediate flows. It performs well on video frame interpolation.
![](./imgs/dain_network.png)
![](../../imgs/dain_network.png)
```
ppgan.apps.DAINPredictor(
......@@ -54,7 +54,7 @@ ppgan.apps.DAINPredictor(
### Colorization model: DeOldifyPredictor
DeOldify is a generative adversarial network with a self-attention mechanism whose generator is a U-Net. It performs well on image colorization.
![](./imgs/deoldify_network.png)
![](../../imgs/deoldify_network.png)
```
ppgan.apps.DeOldifyPredictor(output='output', weight_path=None, render_factor=32)
......@@ -68,7 +68,7 @@ ppgan.apps.DeOldifyPredictor(output='output', weight_path=None, render_factor=32
### Colorization model: DeepRemasterPredictor
The DeepRemaster model is based on spatio-temporal convolutional networks and a self-attention mechanism, and it can colorize frames conditioned on an arbitrary number of input reference frames.
![](./imgs/remaster_network.png)
![](../../imgs/remaster_network.png)
```
ppgan.apps.DeepRemasterPredictor(
......@@ -89,7 +89,7 @@ ppgan.apps.DeepRemasterPredictor(
### Super-resolution model: RealSRPredictor
RealSR designs a novel realistic image downsampling framework for real-world images by estimating various blur kernels and real noise distributions. With this framework, low-resolution images that share the same domain as real-world images can be obtained. It further proposes a real-world super-resolution model aimed at better perceptual quality. Extensive experiments on synthetic noise data and real-world images show that the model effectively reduces noise and improves visual quality.
![](./imgs/realsr_network.png)
![](../../imgs/realsr_network.png)
```
ppgan.apps.RealSRPredictor(output='output', weight_path=None)
......@@ -104,7 +104,7 @@ The EDVR model proposes a novel video restoration framework with enhanced deformable convolutions
EDVR is a super-resolution model based on consecutive frames; it effectively exploits inter-frame information and is faster than the RealSR model.
![](./imgs/edvr_network.png)
![](../../imgs/edvr_network.png)
```
ppgan.apps.EDVRPredictor(output='output', weight_path=None)
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# import mmcv
import os
import cv2
import random
......