Unverified commit 058faa87, authored by YixinKristy, committed by GitHub

Add En docs and Update AIP instruction (#349)

* revise video restore docs

* delete photos

* Update README.md

* Update install.md

* update readme

* Update README.md

* Update get_started.md

* Update get_started.md

* Update readme en

* Update get_started.md

* Update photo_color_en.md
Parent edd5d853
# Quick Start
Note: PaddleGAN is a PaddlePaddle Generative Adversarial Network (GAN) development kit that provides high-performance reproductions of a variety of classical networks, with applications covering image generation, style transfer, animation driving, image/video super-resolution, and colorization.
* Before you start using PaddleGAN, make sure you have read the [install document](./install_en.md) and prepared the dataset according to the [data preparation document](./data_prepare_en.md).
This section shows how to quickly get started with PaddleGAN, using training and evaluation of the CycleGAN model on the Cityscapes dataset as an example.
Note that all model configuration files in PaddleGAN are available at [./PaddleGAN/configs](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/configs).
## Contents
* [Installation](#Installation)
* [Data preparation](#Data-preparation)
* [Training](#Training)
  * [Single Card Training](#Single-Card-Training)
    * [Parameters](#Parameters)
    * [Visualize Training](#Visualize-Training)
    * [Resume Training](#Resume-Training)
  * [Multi-Card Training](#Multi-Card-Training)
* [Evaluation](#Evaluation)
## Installation
For installation and configuration of the runtime environment, please refer to the [installation documentation](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/en_US/install.md) to install PaddlePaddle and PaddleGAN.
This demo assumes that you cloned the PaddleGAN code into the `/home/paddle` directory and that you run all commands from the `/home/paddle/PaddleGAN` directory.
## Data preparation
Prepare the Cityscapes dataset according to the [data preparation documentation](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/en_US/data_prepare.md).
Download the Cityscapes dataset to `~/.cache/ppgan` and soft-link it to `PaddleGAN/data/` using the following script.
```
python data/download_cyclegan_data.py --name cityscapes
```
## Training
### 1. Single Card Training
```
python -u tools/main.py --config-file configs/cyclegan_cityscapes.yaml
```
#### Parameters
* `--config-file (str)`: path to the config file. Here it is the configuration file used for CycleGAN training on the Cityscapes dataset.
* The output logs, weights, and visualization results are saved in `./output_dir` by default. This location can be changed with the `output_dir` parameter in the configuration file:
```
output_dir: output_dir
```
<div align='center'>
<img src='https://user-images.githubusercontent.com/48054808/122734130-65448b00-d2b0-11eb-9fc4-302f3e851115.png' width=60%>
</div>
* Inside the output folder, a new directory is created automatically, named after the model and a timestamp. For example:
```
output_dir
└── CycleGANModel-2020-10-29-09-21
@@ -47,31 +82,57 @@ output_dir
└── epoch002_rec_B.png
```
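
The naming scheme above can be sketched in a few lines of Python. The helper names `run_dir_name` and `run_dir_path` are hypothetical, and the timestamp pattern is only inferred from the example `CycleGANModel-2020-10-29-09-21`; this is not code taken from PaddleGAN.

```python
from datetime import datetime
from pathlib import Path


def run_dir_name(model_name: str, when: datetime) -> str:
    """Return '<model name>-<YYYY-MM-DD-HH-MM>' (assumed naming pattern)."""
    return f"{model_name}-{when.strftime('%Y-%m-%d-%H-%M')}"


def run_dir_path(output_dir: str, model_name: str, when: datetime) -> Path:
    """Join the base output directory with the generated run directory name."""
    return Path(output_dir) / run_dir_name(model_name, when)
```

For instance, `run_dir_path("output_dir", "CycleGANModel", datetime(2020, 10, 29, 9, 21))` points at a directory named `CycleGANModel-2020-10-29-09-21` inside `output_dir`.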
#### Visualize Training
[VisualDL](https://github.com/PaddlePaddle/VisualDL) is a visual analysis tool for deep learning model development. It provides real-time trend visualization of key metrics, visualization of intermediate training samples, network structure visualization, and more. It can show the relationship between hyperparameters and model performance and help with efficient tuning.
Please make sure that you have installed [VisualDL](https://github.com/PaddlePaddle/VisualDL). Refer to the [VisualDL installation guide](https://github.com/PaddlePaddle/VisualDL/blob/develop/README.md#Installation).
To record the metrics or images generated during training with [VisualDL](https://github.com/PaddlePaddle/VisualDL), add the setting `enable_visualdl: True` to the configuration file cyclegan_cityscapes.yaml, then run the corresponding command to monitor the training process in real time.
<div align='center'>
<img src='https://user-images.githubusercontent.com/48054808/122736527-b786ab80-d2b2-11eb-96f8-235f6bbfba5a.png' width=60%>
</div>
If you want to customize the content of the [VisualDL](https://github.com/PaddlePaddle/VisualDL) visualization, you can edit ./PaddleGAN/ppgan/engine/trainer.py.
Launch [VisualDL](https://github.com/PaddlePaddle/VisualDL) locally with:
```
visualdl --logdir output_dir/CycleGANModel-2020-10-29-09-21/
```
Please refer to the [VisualDL User's Guide](https://github.com/PaddlePaddle/VisualDL/blob/develop/docs/components/README.md) for more guidance on how to start and use those visualization functions.
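
Because each training run gets its own timestamped directory, it can be handy to locate the most recent one before launching VisualDL. The following is a minimal sketch under the assumption that run directories follow the `<model name>-YYYY-MM-DD-HH-MM` pattern seen in the example; `latest_run_dir` and `visualdl_command` are hypothetical helpers, not part of PaddleGAN or VisualDL.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# Assumed pattern: '<model name>-YYYY-MM-DD-HH-MM', as in the example directory.
_RUN_DIR = re.compile(r"^(?P<model>.+)-(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2})$")


def latest_run_dir(output_dir: str) -> Optional[Path]:
    """Return the run directory with the most recent timestamp, or None."""
    candidates = []
    for entry in Path(output_dir).iterdir():
        match = _RUN_DIR.match(entry.name)
        if entry.is_dir() and match:
            stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H-%M")
            candidates.append((stamp, entry))
    if not candidates:
        return None
    # Compare on the parsed timestamp only.
    return max(candidates, key=lambda item: item[0])[1]


def visualdl_command(run_dir: Path) -> List[str]:
    """Build the argument list for `visualdl --logdir <run_dir>`."""
    return ["visualdl", "--logdir", str(run_dir)]
```

The resulting list can be passed to `subprocess.run` once VisualDL is installed.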
#### Resume Training
During training, the checkpoint of the previous epoch is saved in `output_dir` by default, which makes it easy to resume training.
In this demo, CycleGAN training saves a checkpoint every five epochs by default. To change this frequency, adjust the `interval` parameter in the **config file**.
<div align='center'>
<img src='https://user-images.githubusercontent.com/48054808/122886954-fda34400-d372-11eb-91a0-cd0e8328335f.png' width=60%>
</div>
```
python -u tools/main.py --config-file configs/cyclegan_cityscapes.yaml --resume your_checkpoint_path
```
#### Args
- `--resume (str)`: path of the checkpoint.
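
To make the effect of the `interval` setting concrete, here is a small sketch. It assumes that a checkpoint is written whenever the 1-based epoch number is a multiple of `interval`; the real trainer may differ in detail, and both function names are hypothetical.

```python
from typing import List


def checkpoint_epochs(total_epochs: int, interval: int = 5) -> List[int]:
    """List the epochs at which a checkpoint would be written (assumed rule)."""
    if interval <= 0:
        raise ValueError("interval must be a positive integer")
    return [epoch for epoch in range(1, total_epochs + 1) if epoch % interval == 0]


def last_resumable_epoch(stopped_at: int, interval: int = 5) -> int:
    """Return the latest checkpointed epoch at or before `stopped_at` (0 if none)."""
    if interval <= 0:
        raise ValueError("interval must be a positive integer")
    return (stopped_at // interval) * interval
```

Under this assumption, a run interrupted during epoch 12 with the default interval would resume from the epoch 10 checkpoint.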
### 2. Multi-Card Training
```
CUDA_VISIBLE_DEVICES=0,1 python -m paddle.distributed.launch tools/main.py --config-file configs/cyclegan_cityscapes.yaml
```
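
If you launch training from a Python script rather than a shell, the same command can be assembled as an environment plus an argument list. This is only a sketch: `build_launch` is a hypothetical helper, and it merely mirrors the shell command above without starting anything.

```python
import os
from typing import Dict, List, Sequence, Tuple


def build_launch(gpu_ids: Sequence[int], config_file: str) -> Tuple[Dict[str, str], List[str]]:
    """Return (environment, argv) equivalent to the shell command above."""
    env = dict(os.environ)
    # Restrict the visible GPUs, e.g. [0, 1] -> "0,1".
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    argv = [
        "python", "-m", "paddle.distributed.launch",
        "tools/main.py", "--config-file", config_file,
    ]
    return env, argv
```

The pair can then be handed to `subprocess.run(argv, env=env, check=True)` from the PaddleGAN repository root, provided PaddlePaddle is installed.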
## Evaluation
```
python tools/main.py --config-file configs/cyclegan_cityscapes.yaml --evaluate-only --load your_weight_path
```
#### Args
- `--evaluate-only`: whether to run evaluation only, without training
- `--load (str)`: path of the weights to load
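
The four command-line flags used in this guide can be summarized with a small `argparse` sketch. It is an illustration of how the flags relate to each other, not the actual parser in `tools/main.py`, which may define more options and different defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser covering only the four flags described in this guide (a sketch)."""
    parser = argparse.ArgumentParser(description="Sketch of the flags used in this guide")
    parser.add_argument("--config-file", type=str, required=True,
                        help="path to the config file")
    parser.add_argument("--resume", type=str, default=None,
                        help="path of the checkpoint to resume training from")
    parser.add_argument("--evaluate-only", action="store_true",
                        help="run evaluation only, without training")
    parser.add_argument("--load", type=str, default=None,
                        help="path of the weights to load")
    return parser
```

Parsing `["--config-file", "configs/cyclegan_cityscapes.yaml", "--evaluate-only", "--load", "your_weight_path"]` with this parser yields `evaluate_only=True` and leaves `resume` unset.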
# Image Colorization
PaddleGAN provides the [DeOldify](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/apis/apps.md#ppganappsdeoldifypredictor) model for image colorization.
## DeOldifyPredictor
[DeOldify](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/apis/apps.md#ppganappsdeoldifypredictor) is a generative adversarial network with a self-attention mechanism. Its generator is a U-Net-structured network that performs well on image and video colorization.
<div align='center'>
<img src='https://user-images.githubusercontent.com/48054808/117925538-fd526a80-b329-11eb-8924-8f2614fcd9e6.png'>
</div>
### Parameters
- `output (str, optional)`: path of the output folder. Default value: `output`
- `weight_path (None, optional)`: path of the weights to load. If not set, the default weights are downloaded from the cloud to the local machine. Default value: `None`
- `artistic (bool)`: whether to use the "artistic" model. "Artistic" models are likely to produce more interesting colors, but with some artifacts.
- `render_factor (int)`: this value is multiplied by 16 and used as the resize value for the input frame. If it is set to 32, the input frame is resized to (32 * 16, 32 * 16) and then fed into the network.
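
The `render_factor` arithmetic described above is easy to check with a short sketch. The helper `render_size` is hypothetical and only restates the rule from the parameter description; it is not DeOldify's own code.

```python
from typing import Tuple

RESIZE_MULTIPLE = 16  # factor stated in the parameter description above


def render_size(render_factor: int) -> Tuple[int, int]:
    """Return the (width, height) an input frame is resized to."""
    if render_factor <= 0:
        raise ValueError("render_factor must be a positive integer")
    side = render_factor * RESIZE_MULTIPLE
    return (side, side)
```

With the value 32 used in the description, the input frame is resized to 512 x 512 before it reaches the network.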
### Usage
**1. API Prediction**
```
from ppgan.apps import DeOldifyPredictor
deoldify = DeOldifyPredictor()
deoldify.run("/home/aistudio/先烈.jpg")  # path of the original image
```
*The `run` interface is a common interface for images and videos. Since the input here is an image, the `run_image` interface is also suitable.
[Complete API interface usage instructions]()
**2. Command-Line Prediction**
```
# --input: path of the original image
# --process_order: order in which the models process the original image
# --output: path of the final image
!python applications/tools/video-enhance.py --input /home/aistudio/先烈.jpg \
                               --process_order DeOldify \
                               --output output_dir
```
### Experience Online Projects
**1. [Old Beijing City Video Restoration](https://aistudio.baidu.com/aistudio/projectdetail/1161285)**
**2. [PaddleGAN ❤️ 520 Edition](https://aistudio.baidu.com/aistudio/projectdetail/1956943?channelType=0&channel=0)**
## Installation
This document describes how to install PaddleGAN and its dependencies. For a product overview, please refer to the [README](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/README_en.md).
### Requirements
* PaddlePaddle >= 2.1.0
* Python >= 3.6
* CUDA >= 10.1
### Install PaddlePaddle
```
# CUDA10.1
python -m pip install paddlepaddle-gpu==2.1.0.post101 -f https://mirror.baidu.com/pypi/simple
# CPU
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
For other installation methods, such as conda or building from source, please refer to the [PaddlePaddle installation documentation](https://www.paddlepaddle.org.cn/documentation/docs/en/install/index_en.html). If your installed CUDA version is different, please visit the [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick) home page for more help.
Make sure that PaddlePaddle is installed at the required version or higher, then verify the installation with the following commands.
```
# verify that PaddlePaddle is installed successfully in your Python interpreter
>>> import paddle
>>> paddle.utils.run_check()
# Confirm PaddlePaddle version
python -c "import paddle; print(paddle.__version__)"
```
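
Once the version string has been printed, it can be compared against the minimum requirement programmatically. The sketch below uses only the standard library; `release_tuple` and `meets_minimum` are hypothetical helpers, and the comparison ignores suffixes such as `.post101`.

```python
import re
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '2.1.0.post101' -> (2, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(installed: str, minimum: str = "2.1.0") -> bool:
    """True if the installed release is at least the required minimum."""
    return release_tuple(installed) >= release_tuple(minimum)
```

Comparing integer tuples rather than raw strings avoids the pitfall that `"2.10.0" < "2.2.0"` when compared as text.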
### Install PaddleGAN
#### 1. Install via PIP (only Python3 is available)
* Install
```
# only support Python3
python3 -m pip install --upgrade ppgan
```
* Download the examples and configuration files by cloning the source code:
```
git clone https://github.com/PaddlePaddle/PaddleGAN
cd PaddleGAN
```
#### 2. Install via source code
```
git clone https://github.com/PaddlePaddle/PaddleGAN
cd PaddleGAN
pip install -v -e . # or "python setup.py develop"
# Install other dependencies
pip install -r requirements.txt
```
### Other Third-Party Tool Installation
#### 1. ffmpeg
All tasks involving video require `ffmpeg` to be installed. We recommend installing it with [conda](https://docs.conda.io/en/latest/miniconda.html):
```
conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
```
#### 2. VisualDL
If you want to use [PaddlePaddle VisualDL](https://github.com/PaddlePaddle/VisualDL) to visualize the training process, please install `VisualDL` (for more details, see [here](./get_started.md)):
```
python -m pip install visualdl -i https://mirror.baidu.com/pypi/simple
```
*Note: VisualDL officially maintains only versions installed under Python 3 or higher.
# Prediction API Reference
PaddleGAN (ppgan.apps) provides prediction APIs for a range of applications, including super-resolution, frame interpolation, colorization, makeup transfer, image animation, and face parsing. The APIs come with trained high-performance models built in and support flexible, efficient inference.
* Colorization:
  * [DeOldify](#ppgan.apps.DeOldifyPredictor)
  * [DeepRemaster](#ppgan.apps.DeepRemasterPredictor)
* Super-resolution:
  * [RealSR](#ppgan.apps.RealSRPredictor)
  * [EDVR](#ppgan.apps.EDVRPredictor)
* Frame interpolation:
  * [DAIN](#ppgan.apps.DAINPredictor)
* Image motion driving:
@@ -24,17 +24,15 @@ ppgan.apps includes super-resolution, frame interpolation, colorization, makeup transfer, image animation, face
### Switching between CPU and GPU
By default, if the machine has a GPU and the GPU build of [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/windows-pip.html) is installed, inference runs on the GPU. Otherwise, if the CPU build is installed, inference runs on the CPU.
To switch between CPU and GPU manually, use the following:
```
import paddle
paddle.set_device('cpu')  # use the CPU
#paddle.set_device('gpu')  # use the GPU
```
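
The default behaviour described above can be made explicit in code. The following is a hedged sketch: `pick_device` and `apply_device` are hypothetical helpers, and `apply_device` relies on `paddle.is_compiled_with_cuda()` and `paddle.set_device()` being available in the installed PaddlePaddle version.

```python
def pick_device(cuda_build: bool, prefer_gpu: bool = True) -> str:
    """Return the device string to pass to paddle.set_device."""
    return "gpu" if (cuda_build and prefer_gpu) else "cpu"


def apply_device(prefer_gpu: bool = True) -> str:
    """Select and set the device; requires PaddlePaddle to be installed."""
    import paddle  # imported lazily so pick_device works without Paddle

    device = pick_device(paddle.is_compiled_with_cuda(), prefer_gpu)
    paddle.set_device(device)
    return device
```

Calling `apply_device(prefer_gpu=False)` forces CPU inference even when the GPU build is installed.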
## ppgan.apps.DeOldifyPredictor
@@ -43,7 +41,7 @@ paddle.set_device('cpu')
ppgan.apps.DeOldifyPredictor(output='output', weight_path=None, render_factor=32)
```
> Builds a DeOldify instance. DeOldify is a GAN-based colorization model for images and videos. This interface can colorize images or videos. mp4 is the recommended video format.
>
> **Example**
>
@@ -71,11 +69,10 @@ run(input)
> **Parameters**
>
> > - input (str|np.ndarray|Image.Image): the input image or video file. An image can be given as a file path, an np.ndarray, or a PIL.Image. A video can only be given as a file path.
>
> **Returns**
>
> > - tuple(pred_img(np.array), out_path(str)): when the input is an image, returns the predicted image (type PIL.Image) and the path where the image is saved.
> > - tuple(frame_path(str), out_path(str)): when the input is a video, frame_path is the path where the colorized frames are saved and out_path is the path where the colorized video is saved.
### run_image
@@ -89,11 +86,10 @@ run_image(img)
> **Parameters**
>
> > - img (str|np.ndarray|Image.Image): the input image, given as a file path, an np.ndarray, or a PIL.Image.
>
> **Returns**
>
> > - pred_img(PIL.Image): the predicted image, as a PIL.Image.
### run_video
@@ -119,7 +115,7 @@ run_video(video)
ppgan.apps.DeepRemasterPredictor(output='output', weight_path=None, colorization=False, reference_dir=None, mindim=360)
```
> Builds a DeepRemasterPredictor instance. DeepRemaster is a GAN-based video colorization and restoration model that can take reference color images as input. This interface currently supports only video input. mp4 is the recommended format.
>
> **Example**
>
@@ -165,6 +161,8 @@ ppgan.apps.RealSRPredictor(output='output', weight_path=None)
> Builds a RealSR instance. RealSR (Real-World Super-Resolution via Kernel Estimation and Noise Injection), published at CVPR 2020 Workshops, is a super-resolution model trained on real-world images. This interface upscales the input image or video by 4x. mp4 is the recommended video format.
>
> *Note: input images for RealSR must be smaller than 1000x1000 pixels.
>
> **Example**
>
> ```
@@ -187,11 +185,10 @@ run(video_path)
> **Parameters**
>
> > - video_path (str): path of the input video file.
>
> **Returns**
>
> > - tuple(pred_img(np.array), out_path(str)): when the input is an image, returns the predicted image (type PIL.Image) and the path where the image is saved.
> > - tuple(frame_path(str), out_path(str)): when the input is a video, frame_path is the path where the super-resolved frames are saved and out_path is the path where the super-resolved video is saved.
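
The RealSR size limit and the 4x scale factor mentioned in this section can be combined into a quick pre-check. This is a minimal sketch that interprets "smaller than 1000x1000" as each side being below 1000 pixels, which is an assumption; `fits_realsr` and `upscaled_size` are hypothetical helpers.

```python
from typing import Tuple

MAX_SIDE = 1000  # assumed per-side limit derived from the 1000x1000 note
SCALE = 4        # upscaling factor stated for the RealSR interface


def fits_realsr(width: int, height: int) -> bool:
    """True if both sides are positive and below the assumed limit."""
    return 0 < width < MAX_SIDE and 0 < height < MAX_SIDE


def upscaled_size(width: int, height: int) -> Tuple[int, int]:
    """Return the expected output size after 4x super-resolution."""
    if not fits_realsr(width, height):
        raise ValueError("input must be smaller than 1000x1000 pixels")
    return (width * SCALE, height * SCALE)
```

For a 640 x 480 input, the expected output is 2560 x 1920.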
### run_image
@@ -236,6 +233,14 @@ ppgan.apps.EDVRPredictor(output='output', weight_path=None)
> Builds an EDVR instance. EDVR (Video Restoration with Enhanced Deformable Convolutional Networks, paper: https://arxiv.org/abs/1905.02716) is a video super-resolution model. This interface upscales videos by 2x. mp4 is the recommended video format.
>
> *Note: this interface currently works only in static graph mode. Add the following code to enable static graph mode before use:
>
> ```
> import paddle
> paddle.enable_static()   # enable static graph mode before using the predictor
> paddle.disable_static()  # disable static graph mode when finished
> ```
>
> **Example**
>
> ```
@@ -274,11 +279,19 @@ ppgan.apps.DAINPredictor(output='output', weight_path=None,time_step=None, use
> Builds an instance of the DAIN frame interpolation model. DAIN (Depth-Aware Video Frame Interpolation, paper: https://arxiv.org/abs/1904.00830) interpolates video frames to produce a video with a higher frame rate.
>
> *Note: this interface currently works only in static graph mode. Add the following code to enable static graph mode before use:
>
> ```
> import paddle
> paddle.enable_static()   # enable static graph mode before using the predictor
> paddle.disable_static()  # disable static graph mode when finished
> ```
>
> **Example**
>
> ```
> from ppgan.apps import DAINPredictor
> dain = DAINPredictor(time_step=0.5)  # time_step currently has no default value and must be set explicitly
> # run on a video file
> dain.run("docs/imgs/test.mp4")
> ```
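
One way to make sure static graph mode is switched off again after prediction is to wrap the two calls in a context manager. The sketch below is an illustration, not part of PaddleGAN: `static_graph` is a hypothetical helper that takes the enable and disable functions as arguments, so it can be tried out without PaddlePaddle installed.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def static_graph(enable: Callable[[], None], disable: Callable[[], None]) -> Iterator[None]:
    """Enable static graph mode for the duration of a `with` block."""
    enable()
    try:
        yield
    finally:
        disable()  # always switch static graph mode off again, even on errors


# Intended use with PaddlePaddle (not executed here):
#
#     import paddle
#     from ppgan.apps import DAINPredictor
#     with static_graph(paddle.enable_static, paddle.disable_static):
#         dain = DAINPredictor(time_step=0.5)
#         dain.run("docs/imgs/test.mp4")
```

Passing the functions in explicitly keeps the helper independent of Paddle, and the `finally` clause guarantees the mode is reset if prediction raises an exception.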
@@ -313,7 +326,9 @@ run(video_path)
ppgan.apps.FirstOrderPredictor(output='output', weight_path=None, config=None, relative=False, adapt_scale=False, find_best_frame=False, best_frame=None)
```
> Builds an instance of the FirstOrder model, which is used for image animation: given a source image and a driving video, it generates a video whose subject comes from the source image and whose motion comes from the driving video.
>
> Paper: First Order Motion Model for Image Animation, https://arxiv.org/abs/2003.00196.
>
> **Example**
>
@@ -354,11 +369,22 @@ run(source_image,driving_video)
```python
ppgan.apps.FaceParsePredictor(output_path='output')
```
> Builds a face parsing model instance. Given an input face image, face parsing assigns a pixel-level label to each semantic component (such as hair, lips, nose, and ears). We use BiSeNet for this task.
>
> Paper: BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, https://arxiv.org/abs/1808.00897v1.
>
> *Note: this interface requires the dlib package. Install it before use with:
>
> ```
> pip install dlib
> ```
> Installing this package on Windows may take a long time; please be patient.
>
> **Parameters:**
>
> > - input_image: path of the image file to parse
> > - output_path: path where the output is saved
> **Example:**
>
@@ -368,6 +394,7 @@ ppgan.apps.FaceParsePredictor(output_path='output')
> parser.run('docs/imgs/face.png')
> ```
> **Returns:**
>
> > - mask(numpy.ndarray): the mask matrix of the parsed face components, as a numpy.ndarray
## ppgan.apps.AnimeGANPredictor
@@ -375,7 +402,9 @@ ppgan.apps.FaceParsePredictor(output_path='output')
```python
ppgan.apps.AnimeGANPredictor(output_path='output_dir',weight_path=None,use_adjust_brightness=True)
```
> Uses AnimeGAN v2 to render landscape images in an anime style.
>
> Paper: AnimeGAN: A Novel Lightweight GAN for Photo Animation, https://link.springer.com/chapter/10.1007/978-981-15-5577-0_18.
> **Parameters:**
>
@@ -389,6 +418,7 @@ ppgan.apps.AnimeGANPredictor(output_path='output_dir',weight_path=None,use_adjus
> predictor.run('docs/imgs/animeganv2_test.jpg')
> ```
> **Returns:**
>
> > - anime_image(numpy.ndarray): the stylized landscape image
@@ -398,7 +428,9 @@ ppgan.apps.AnimeGANPredictor(output_path='output_dir',weight_path=None,use_adjus
ppgan.apps.MiDaSPredictor(output=None, weight_path=None)
```
> MiDaSv2 monocular depth estimation model; see https://github.com/intel-isl/MiDaS. Monocular depth estimation estimates depth from a single RGB image.
>
> Paper: Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, https://arxiv.org/abs/1907.01341v3
> **Example**
>
@@ -431,37 +463,40 @@ ppgan.apps.MiDaSPredictor(output=None, weight_path=None)
> > - weight_path (str): path of the model weights. Defaults to None, in which case the built-in pretrained model is downloaded automatically.
> **Returns:**
>
> > - prediction (numpy.ndarray): the prediction result.
> > - pfm_f (str): path of the saved pfm file, returned if the output path is set.
> > - png_f (str): path of the saved png file, returned if the output path is set.
## ppgan.apps.Wav2LipPredictor
```python
ppgan.apps.Wav2LipPredictor(face=None, audio_seq=None, outfile=None)
```
> Builds an instance of the Wav2Lip model, which is used for lip-sync synthesis: given a video of a person and an audio clip, it synchronizes the person's lip movements with the input speech.
>
> Paper: A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild, http://arxiv.org/abs/2008.10010.
>
> **Example**
>
> ```
> from ppgan.apps import Wav2LipPredictor
> import ppgan
> predictor = Wav2LipPredictor()
> predictor.run('/home/aistudio/先烈.jpeg', '/home/aistudio/pp_guangquan_zhenzhu46s.mp4','wav2lip')
> ```
> **Parameters:**
> - face (str): path of the image or video file containing the person.
> - audio_seq (str): path of the input audio file. The format can be `.wav`, `.mp3`, `.m4a`, or any other format that ffmpeg can process.
> - outfile (str): path of the output video file.
>
> **Returns**
>
> > None.