Commit ce38ba13 authored by LielinJiang

Merge branch 'master' of https://github.com/PaddlePaddle/PaddleGAN into release/0.1.0

...@@ -17,14 +17,14 @@ GAN -- Generative Adversarial Networks, hailed by the "Father of Convolutional Networks" **Yann LeCun**
* Please make sure you have installed PaddlePaddle and PaddleGAN correctly by following the [installation guide](./docs/zh_CN/install.md)
* Use the pre-trained models through the ppgan.apps interface:
```python
from ppgan.apps import RealSRPredictor
sr = RealSRPredictor()
sr.run("docs/imgs/monarch.png")
```
* For more usage of the pre-trained models, please refer to [ppgan.apps apis](./docs/zh_CN/apis/apps.md)
* More training and evaluation tutorials:
  * [Data preparation](./docs/zh_CN/data_prepare.md)
  * [Training/Evaluation/Inference tutorial](./docs/zh_CN/get_started.md)
...@@ -35,6 +35,7 @@ GAN -- Generative Adversarial Networks, hailed by the "Father of Convolutional Networks" **Yann LeCun**
* [CycleGAN](./docs/zh_CN/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/zh_CN/tutorials/psgan.md)
* [First Order Motion Model](./docs/zh_CN/tutorials/motion_driving.md)
* [FaceParsing](./docs/zh_CN/tutorials/face_parse.md)
## Composite Applications
...@@ -90,19 +91,20 @@ GAN -- Generative Adversarial Networks, hailed by the "Father of Convolutional Networks" **Yann LeCun**
- First version released, supporting the Pixel2Pixel, CycleGAN and PSGAN models, with applications including video frame interpolation, super resolution, old photo/video colorization, and video motion generation.
- Modular design with simple, easy-to-use interfaces.
## Join the PaddleGAN Technical Discussion Group
Scan the QR code below to join the PaddleGAN QQ group (ID: 1058398620) for more responsive Q&A and discussion with developers from all industries. We look forward to your joining!
<div align='center'>
  <img src='./docs/imgs/qq.png' width='250' height='300'/>
</div>
### PaddleGAN Special Interest Group (SIG)
The SIG was first proposed and used by the [ACM (Association for Computing Machinery)](https://en.wikipedia.org/wiki/Association_for_Computing_Machinery) in 1961. Top international open source organizations, including [Kubernetes](https://kubernetes.io/), adopt the SIG model so that members who share a specific interest can share knowledge, learn, and develop projects together. Members do not need to be in the same country/region or organization; as long as they are like-minded, they can learn, work, and have fun together toward the same goals.
PaddleGAN SIG is exactly such a developer organization for people interested in GANs. It brings together frontline developers of Baidu PaddlePaddle, senior engineers from Fortune 500 companies, and students from top universities at home and abroad.
We are continually recruiting interested and capable developers to build this project with us and to explore more useful and interesting applications together. After joining the QQ group, feel free to contact us about joining the SIG and participating in its work.
## Contributing
......
...@@ -24,7 +24,7 @@ GAN-Generative Adversarial Network, was praised by "the Father of Convolutional
sr = RealSRPredictor()
sr.run("docs/imgs/monarch.png")
```
* For more usage of the pre-trained models, please refer to [ppgan.apps apis](./docs/en_US/apis/apps.md)
* More tutorials:
  - [Data preparation](./docs/en_US/data_prepare.md)
  - [Training/Evaluating/Testing basic usage](./docs/zh_CN/get_started.md)
...@@ -35,6 +35,7 @@ GAN-Generative Adversarial Network, was praised by "the Father of Convolutional
* [CycleGAN](./docs/en_US/tutorials/pix2pix_cyclegan.md)
* [PSGAN](./docs/en_US/tutorials/psgan.md)
* [First Order Motion Model](./docs/en_US/tutorials/motion_driving.md)
* [FaceParsing](./docs/en_US/tutorials/face_parse.md)
## Composite Application
...@@ -79,20 +80,21 @@ GAN-Generative Adversarial Network, was praised by "the Father of Convolutional
- Released the first version, supporting the Pixel2Pixel, CycleGAN and PSGAN models. Supported applications include video frame interpolation, super resolution, image and video colorization, and image animation.
- Modular design and friendly interfaces.
## Community
Scan the QR code below to join the PaddleGAN QQ Group (ID: 1058398620), where you can get official technical support and communicate with other developers. We look forward to your participation!
<div align='center'>
  <img src='./docs/imgs/qq.png' width='250' height='300'/>
</div>
### PaddleGAN Special Interest Group (SIG)
SIGs were first proposed and used by the [ACM (Association for Computing Machinery)](https://en.wikipedia.org/wiki/Association_for_Computing_Machinery) in 1961. Top international open source organizations, including [Kubernetes](https://kubernetes.io/), adopt the SIG model so that members who share a specific interest can share knowledge, learn, and develop projects together. Members do not need to be in the same country/region or organization; as long as they are like-minded, they can study, work, and play together toward the same goals.
PaddleGAN SIG is exactly such a developer organization for people interested in GANs. It brings together frontline PaddlePaddle developers, senior engineers from Fortune 500 companies, and students from top universities at home and abroad.
We are continually recruiting interested and capable developers to build this project with us and to explore more useful and interesting applications together.
## Contributing
......
import argparse

import paddle

from ppgan.apps.face_parse_predictor import FaceParsePredictor

parser = argparse.ArgumentParser()
parser.add_argument("--input_image", type=str, help="path to source image")
parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")

if __name__ == "__main__":
    args = parser.parse_args()

    if args.cpu:
        paddle.set_device('cpu')

    predictor = FaceParsePredictor()
    predictor.run(args.input_image)
# Face Parsing
## 1. Introduction to face parsing
Face parsing addresses the task of parsing facial components from face images. We utilize BiSeNet to handle this problem, computing a pixel-wise label map of a face image. It is useful for a variety of tasks, including recognition, animation, and synthesis. This application currently powers our makeup transfer model.
## 2. How to use
### 2.1 Test
Run the following command to complete the face parsing task. The output will be the segmented face-component mask for the input image.
```
cd applications
python face_parse.py --input_image ../docs/imgs/face.png
```
**Parameters:**
- input_image: path of the input face image
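The predictor can also be called from Python through the ppgan.apps interface (a minimal sketch; the image path is a placeholder relative to the repository root):
```python
from ppgan.apps import FaceParsePredictor

predictor = FaceParsePredictor()
# Saves the colorized mask to output/face_parse.png and returns it as a numpy array.
mask = predictor.run("docs/imgs/face.png")
```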
## 3. Results
![](../../imgs/face_parse_out.png)
## 4. Reference
```
@misc{yu2018bisenet,
title={BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation},
author={Changqian Yu and Jingbo Wang and Chao Peng and Changxin Gao and Gang Yu and Nong Sang},
year={2018},
eprint={1808.00897},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
...@@ -33,8 +33,8 @@ python tools/psgan_infer.py \
2. Download the landmarks [data](https://paddlegan.bj.bcebos.com/landmarks.tar) and uncompress it
3. Run the following commands to substitute the files:
```
rm -rf MT-Dataset/landmarks/makeup && mv landmarks/makeup MT-Dataset/landmarks/
rm -rf MT-Dataset/landmarks/non-makeup && mv landmarks/non-makeup MT-Dataset/landmarks/
cp landmarks/train_makeup.txt MT-Dataset/train_makeup.txt
cp landmarks/train_non-makeup.txt MT-Dataset/train_non-makeup.txt
```
......
# Applications API Reference
ppgan.apps provides applications such as super resolution, frame interpolation, colorization, makeup transfer, image animation, and face parsing. The interfaces are concise, and pre-trained models are built in, so they can be used directly.

## Common Usage
...@@ -244,7 +244,7 @@ run(video_path)
>
> **Returns**
>
> > - tuple(str, str): the former is the save path of each frame of the super-resolved video, the latter is the path of the super-resolved video itself.
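> For illustration, a minimal sketch of calling such a video super-resolution predictor through ppgan.apps (EDVRPredictor is assumed here to be the class this section describes; the input path is a placeholder):
>
> ```python
> from ppgan.apps import EDVRPredictor
>
> sr = EDVRPredictor()
> # Assumed to return (save path of the output frames, path of the super-resolved video).
> frames_dir, video_path = sr.run("input.mp4")
> ```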
...@@ -254,7 +254,7 @@ run(video_path)
ppgan.apps.DAINPredictor(output='output', weight_path=None, time_step=None, use_gpu=True, key_frame_thread=0, remove_duplicates=False)
```
> Builds an instance of the DAIN frame-interpolation model. DAIN: Depth-Aware Video Frame Interpolation, paper: https://arxiv.org/abs/1904.00830 . It interpolates the frames of a video to produce a video with a higher frame rate.
>
> **Example**
>
...@@ -269,7 +269,7 @@ ppgan.apps.DAINPredictor(output='output', weight_path=None, time_step=None, use
>
> > - output_path (str): save path for the prediction output; the default is output. Note that the actual save path is output/DAIN.
> > - weight_path (str): path to the model weights; the default is None, in which case the built-in pre-trained model is downloaded automatically.
> > - time_step (float): the frame rate is multiplied by 1./time_step; for example, time_step=0.5 gives 2x frame interpolation and 0.25 gives 4x.
> > - use_gpu (bool): whether to use the GPU for prediction; the default is True.
> > - remove_duplicates (bool): whether to remove duplicate frames; the default is False.
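> A minimal usage sketch based on the parameters above (the input video path is a placeholder):
>
> ```python
> from ppgan.apps import DAINPredictor
>
> # time_step=0.5 halves the time step, i.e. 2x frame interpolation.
> dain = DAINPredictor(time_step=0.5)
> dain.run("input.mp4")
> ```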
...@@ -295,7 +295,7 @@ run(video_path)
ppgan.apps.FirstOrderPredictor(output='output', weight_path=None, config=None, relative=False, adapt_scale=False, find_best_frame=False, best_frame=None)
```
> Builds an instance of the FirstOrder model for Image Animation: given a source image and a driving video, it generates a video whose subject comes from the source image and whose motion comes from the driving video. Paper: First Order Motion Model for Image Animation, https://arxiv.org/abs/2003.00196 .
>
> **Example**
>
...@@ -330,3 +330,24 @@ run(source_image, driving_video)
> **Returns**
>
> > None.
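> A minimal usage sketch (both paths are placeholders; the generated video is saved under the output directory, and run returns nothing, as noted above):
>
> ```python
> from ppgan.apps import FirstOrderPredictor
>
> animator = FirstOrderPredictor(relative=True)
> animator.run("source.png", "driving.mp4")
> ```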
## ppgan.apps.FaceParsePredictor
```python
ppgan.apps.FaceParsePredictor(output_path='output')
```
> Builds an instance of the face parsing model. Given an input face image, face parsing assigns a pixel-wise label to every semantic component (e.g., hair, lips, nose, ears). We use BiSeNet for this task. Paper: BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, https://arxiv.org/abs/1808.00897v1 .
>
> **Parameters:**
>
> > - input_image: path of the input image to be parsed
>
> **Example:**
>
> ```python
> from ppgan.apps import FaceParsePredictor
> parser = FaceParsePredictor()
> parser.run('docs/imgs/face.png')
> ```
> **Returns:**
>
> > - mask (numpy.ndarray): the parsed face-component mask, of type numpy.ndarray
# Face Parsing
## 1. Introduction to face parsing
Face parsing is a special case of semantic image segmentation: it computes a pixel-wise label map of the semantic components of a face image (such as hair, lips, nose, and eyes). Given an input face image, face parsing assigns a pixel-level label to each semantic component. We utilize BiSeNet to solve this problem. Face parsing is useful in many tasks, such as recognition, animation, and synthesis; it currently powers our makeup transfer model.
## 2. Usage
### 2.1 Test
Run the following command to perform face parsing. After the program finishes successfully, the parsed image is written to the `output` folder:
```
cd applications
python face_parse.py --input_image ../docs/imgs/face.png
```
**Parameters:**
- input_image: path of the input image to be parsed
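Face parsing can also be invoked directly from Python (a minimal sketch; the image path is a placeholder relative to the repository root):
```python
from ppgan.apps import FaceParsePredictor

predictor = FaceParsePredictor()
# Writes the colorized mask to output/face_parse.png and returns it as a numpy.ndarray.
mask = predictor.run("docs/imgs/face.png")
```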
## 3. Results
![](../../imgs/face_parse_out.png)
## 4. References
```
@misc{yu2018bisenet,
title={BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation},
author={Changqian Yu and Jingbo Wang and Chao Peng and Changxin Gao and Gang Yu and Nong Sang},
year={2018},
eprint={1808.00897},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
...@@ -33,8 +33,8 @@ python tools/psgan_infer.py \
2. Download the landmarks data [lmks](https://paddlegan.bj.bcebos.com/landmarks.tar) and uncompress it
3. Run the following commands to replace the folders and files:
```
rm -rf MT-Dataset/landmarks/makeup && mv landmarks/makeup MT-Dataset/landmarks/
rm -rf MT-Dataset/landmarks/non-makeup && mv landmarks/non-makeup MT-Dataset/landmarks/
cp landmarks/train_makeup.txt MT-Dataset/train_makeup.txt
cp landmarks/train_non-makeup.txt MT-Dataset/train_non-makeup.txt
```
......
...@@ -18,3 +18,4 @@ from .deoldify_predictor import DeOldifyPredictor
from .realsr_predictor import RealSRPredictor
from .edvr_predictor import EDVRPredictor
from .first_order_predictor import FirstOrderPredictor
from .face_parse_predictor import FaceParsePredictor
...@@ -269,6 +269,7 @@ class DAINPredictor(BasePredictor):
            return sum([2**i for (i, v) in enumerate(diff.flatten()) if v])

        hashes = {}
        max_interp = 9  # maximum duplicate-frame gap tolerated before thinning
        image_paths = sorted(glob.glob(os.path.join(paths, '*.png')))
        for image_path in image_paths:
            image = cv2.imread(image_path)
...@@ -283,7 +284,16 @@ class DAINPredictor(BasePredictor):
                last_index = int(
                    hashed_paths[-1].split('/')[-1].split('.')[-2]) + 1
                gap = 2 * (last_index - first_index) - 1
                if gap > 2 * max_interp:
                    # Very long runs of duplicates: keep three anchor frames
                    # and delete the rest of the run.
                    cut1 = len(hashed_paths) // 3
                    cut2 = cut1 * 2
                    for p in hashed_paths[1:cut1 - 1]:
                        os.remove(p)
                    for p in hashed_paths[cut1 + 1:cut2]:
                        os.remove(p)
                    for p in hashed_paths[cut2 + 1:]:
                        os.remove(p)
                elif gap > max_interp:
                    mid = len(hashed_paths) // 2
                    for p in hashed_paths[1:mid - 1]:
                        os.remove(p)
......
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

from PIL import Image
import numpy as np
import cv2

import ppgan.faceutils as futils
from ppgan.utils.preprocess import *
from ppgan.utils.visual import mask2image

from .base_predictor import BasePredictor


class FaceParsePredictor(BasePredictor):
    def __init__(self, output_path='output'):
        self.output_path = output_path
        self.input_size = (512, 512)
        self.up_ratio = 0.6 / 0.85
        self.down_ratio = 0.2 / 0.85
        self.width_ratio = 0.2 / 0.85
        self.face_parser = futils.mask.FaceParser()

    def run(self, image):
        image = Image.open(image).convert("RGB")
        face = futils.dlib.detect(image)

        if not face:
            return
        face_on_image = face[0]
        # Crop the image around the detected face before parsing.
        image, face, crop_face = futils.dlib.crop(image, face_on_image,
                                                  self.up_ratio,
                                                  self.down_ratio,
                                                  self.width_ratio)
        np_image = np.array(image)
        mask = self.face_parser.parse(
            np.float32(cv2.resize(np_image, self.input_size)))
        mask = cv2.resize(mask.numpy(), (256, 256))
        mask = mask.astype(np.uint8)
        # Colorize the label map for visualization.
        mask = mask2image(mask)
        if not os.path.exists(self.output_path):
            os.makedirs(self.output_path)
        save_path = os.path.join(self.output_path, 'face_parse.png')
        cv2.imwrite(save_path, mask)
        return mask
...@@ -43,16 +43,6 @@ def toImage(net_output):
    return img


PS_WEIGHT_URL = "https://paddlegan.bj.bcebos.com/models/psgan_weight.pdparams"
...@@ -81,6 +71,7 @@ class PreProcess:
                                                  self.down_ratio,
                                                  self.width_ratio)
        np_image = np.array(image)
        image_trans = self.transform(np_image)
        mask = self.face_parser.parse(
            np.float32(cv2.resize(np_image, (512, 512))))
        mask = cv2.resize(mask.numpy(), (self.img_size, self.img_size),
...@@ -88,7 +79,8 @@ class PreProcess:
        mask = mask.astype(np.uint8)
        mask_tensor = paddle.to_tensor(mask)

        lms = futils.dlib.landmarks(
            image, face) / image_trans.shape[:2] * self.img_size
        lms = lms.round()

        P_np = generate_P_from_lmks(lms, self.img_size, self.img_size,
...@@ -96,10 +88,8 @@ class PreProcess:
        mask_aug = generate_mask_aug(mask, lms)

        return [
            self.norm(image_trans).unsqueeze(0),
            np.float32(mask_aug),
            np.float32(P_np),
            np.float32(mask)
...@@ -212,6 +202,9 @@ class PSGANPredictor(BasePredictor):
        image = postprocess(source_crop, image)
        ref_img_name = os.path.split(reference_path)[1]
        if not os.path.exists(self.output_path):
            os.makedirs(self.output_path)
        save_path = os.path.join(self.output_path,
                                 'transfered_ref_' + ref_img_name)
        image.save(save_path)
...@@ -11,6 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os

import numpy as np

import paddle
...@@ -328,7 +329,7 @@ class MakeupModel(BaseModel):
                              g_B_skin_loss_his * 0.1) * 0.1
        self.losses['G_A_his_loss'] = self.loss_G_A_his
        self.losses['G_B_his_loss'] = self.loss_G_B_his

        # vgg loss
        vgg_s = self.vgg(self.real_A)
......
...@@ -55,3 +55,13 @@ def save_image(image_numpy, image_path, aspect_ratio=1.0):
    if aspect_ratio < 1.0:
        image_pil = image_pil.resize((int(h / aspect_ratio), w), Image.BICUBIC)
    image_pil.save(image_path)


def mask2image(mask: np.array, format="HWC"):
    # Map each label in the mask to a random color for visualization.
    H, W = mask.shape
    canvas = np.zeros((H, W, 3), dtype=np.uint8)
    for i in range(int(mask.max())):
        color = np.random.rand(1, 1, 3) * 255
        canvas += (mask == i)[:, :, None] * color.astype(np.uint8)
    return canvas
...@@ -12,7 +12,14 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import argparse

# Make the repository root importable when this script is run directly.
cur_path = os.path.abspath(os.path.dirname(__file__))
root_path = os.path.split(cur_path)[0]
sys.path.append(root_path)

from ppgan.utils.options import parse_args
from ppgan.utils.config import get_config
from ppgan.apps.psgan_predictor import PSGANPredictor
......