diff --git a/docs/en_US/tutorials/motion_driving.md b/docs/en_US/tutorials/motion_driving.md
index 04f2f203a9b0458074bb62bab9a80538fe745f97..b8c6719960db4bda0a2ec03510a9e9d2a3855244 100644
--- a/docs/en_US/tutorials/motion_driving.md
+++ b/docs/en_US/tutorials/motion_driving.md
@@ -9,7 +9,7 @@
 
 ## Multi-Faces swapping
 
-For photoes with multiple faces, we first detect all of the faces, then do facial expression transfer for each face, and finally put those faces back to the original photo to generate a complete new video.
+For photos with multiple faces, we first detect all of the faces, then do facial expression transfer for each face, and finally put those faces back into the original photo to generate a complete new video.
 
 Specific technical steps are shown below:
 
@@ -17,14 +17,15 @@ Specific technical steps are shown below:
 2. Use the First Order Motion model to do the facial expression transfer of each face
 3. Put those "new" generated faces back to the original photo
 
-At the same time, specifically for face related work, PaddleGAN provides a ["faceutils" tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), including face detection, face segmentation models and more.
+At the same time, specifically for face-related work, PaddleGAN provides a ["faceutils" tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), including face detection, face segmentation models, and more.
 
 ## How to use
-
+### 1 Face test
 Users can upload the prepared source image and driving video, then substitute their paths for the `source_image` and `driving_video` parameters in the following command. It will generate a video file named `result.mp4` in the `output` folder, which is the animated video file.
 
 Note: for photos with multiple faces, the greater the distance between the faces, the better the result quality.
 
+- single face:
 ```
 cd applications/
 python -u tools/first-order-demo.py \
@@ -33,6 +34,14 @@ python -u tools/first-order-demo.py \
      --ratio 0.4 \
      --relative --adapt_scale
 ```
+- multiple faces:
+```
+cd applications/
+python -u tools/first-order-demo.py \
+     --driving_video ../docs/imgs/fom_dv.mp4 \
+     --source_image ../docs/imgs/fom_source_image_multi_person.jpg \
+     --ratio 0.4 \
+     --relative --adapt_scale \
+     --multi_person
+```
 
 **params:**
 - driving_video: the driving video, whose motion is to be transferred.
@@ -40,6 +49,44 @@ python -u tools/first-order-demo.py \
 - relative: indicates whether relative or absolute coordinates of the keypoints in the video are used. Relative coordinates are recommended; absolute coordinates distort the characters after animation.
 - adapt_scale: adapt movement scale based on convex hull of keypoints.
 - ratio: the proportion of the generated image occupied by the pasted-back face; adjust this parameter when adjacent faces in a multi-person image are close to each other. The default value is 0.4 and the range is [0.4, 0.5].
+- multi_person: add this flag when there are multiple faces in the image; by default, a single face is assumed. A scripted alternative to the commands above is sketched below.
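+
+Besides the command-line demo, the predictor can also be called from Python. The following is a minimal sketch using the `FirstOrderPredictor` class from `ppgan.apps`; the constructor keyword arguments (and the presumably exposed `multi_person` switch) are assumed to mirror the CLI flags above, so check `ppgan/apps/first_order_predictor.py` for the exact signature before relying on it:
+
+```
+from ppgan.apps import FirstOrderPredictor
+
+# The keyword arguments are assumed to mirror the CLI flags of
+# first-order-demo.py; verify them against FirstOrderPredictor.__init__.
+predictor = FirstOrderPredictor(output="output",
+                                relative=True,
+                                adapt_scale=True,
+                                ratio=0.4)
+
+# The demo script passes the source image first, then the driving video;
+# the result is written to output/result.mp4.
+predictor.run("../docs/imgs/fom_source_image_multi_person.jpg",
+              "../docs/imgs/fom_dv.mp4")
+```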
+
+### 2 Training
+**Datasets:**
+- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
+- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)
+
+**params:**
+- dataset_name.yaml: create a config file for your own dataset
+
+- For single GPU:
+```
+export CUDA_VISIBLE_DEVICES=0
+python tools/main.py --config-file configs/dataset_name.yaml
+```
+- For multiple GPUs:
+```
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch \
+    tools/main.py \
+    --config-file configs/dataset_name.yaml
+```
+
+**Example:**
+- For single GPU:
+```
+export CUDA_VISIBLE_DEVICES=0
+python tools/main.py --config-file configs/firstorder_fashion.yaml
+```
+- For multiple GPUs:
+```
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch \
+    tools/main.py \
+    --config-file configs/firstorder_fashion.yaml
+```
 
 **Online Tutorial running in AI Studio:**
diff --git a/docs/en_US/tutorials/wav2lip.md b/docs/en_US/tutorials/wav2lip.md
index 5c26d1a401cb08fd9ef4ddd790518e38d2107cf2..7c7ffbf00ee38d52c1175baa2c396041a41152dd 100644
--- a/docs/en_US/tutorials/wav2lip.md
+++ b/docs/en_US/tutorials/wav2lip.md
@@ -43,7 +43,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
 ```
 export CUDA_VISIBLE_DEVICES=0,1,2,3
 python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
     tools/main.py \
     --config-file configs/wav2lip.yaml
 
@@ -58,7 +57,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
 ```
 export CUDA_VISIBLE_DEVICES=0,1,2,3
 python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
     tools/main.py \
     --config-file configs/wav2lip_hq.yaml
 
diff --git a/docs/imgs/fom_source_image_multi_person.jpg b/docs/imgs/fom_source_image_multi_person.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..1799a0cb9935653a6b7ba26d0559bd62bf0172db
Binary files /dev/null and b/docs/imgs/fom_source_image_multi_person.jpg differ
diff --git a/docs/zh_CN/tutorials/motion_driving.md b/docs/zh_CN/tutorials/motion_driving.md
index 1103830e67283ef7ecced2137e66452336d09b33..866fd1ad55ecce05a08f293c367678d3040ae623 100644
--- a/docs/zh_CN/tutorials/motion_driving.md
+++ b/docs/zh_CN/tutorials/motion_driving.md
@@ -25,13 +25,14 @@ The task of the First Order Motion model is image animation: given a source image,
 In addition, PaddleGAN provides the [faceutils tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils) for face-related processing, including face detection, facial-feature segmentation, and keypoint detection.
 
 ## How to use
-
+### 1 Face test
 Users can upload a photo with one or more faces together with a driving video, substitute their own image and video paths for the source_image and driving_video parameters in the command below, and run it to perform single- or multi-face motion and expression transfer. The result is a video file named result.mp4, saved in the output folder.
 
 Note: when using multiple faces, photos in which the faces are far apart work better; the result can also be tuned manually via the ratio parameter.
 
 The project provides a source image and a driving video for demonstration. Run the following command:
 
+- single face (the default):
 ```
 cd applications/
 python -u tools/first-order-demo.py \
@@ -40,12 +41,57 @@ python -u tools/first-order-demo.py \
      --ratio 0.4 \
      --relative --adapt_scale
 ```
+- multiple faces:
+```
+cd applications/
+python -u tools/first-order-demo.py \
+     --driving_video ../docs/imgs/fom_dv.mp4 \
+     --source_image ../docs/imgs/fom_source_image_multi_person.jpg \
+     --ratio 0.4 \
+     --relative --adapt_scale \
+     --multi_person
+```
 
 - driving_video: the driving video; the expressions and motion of the person in it are the objects to be transferred.
 - source_image: the source image, with one or more people; the expressions and motion in the driving video are transferred onto the person or people in this image.
 - relative: whether relative or absolute keypoint coordinates are used; relative coordinates are recommended, since absolute coordinates distort the animated person.
 - adapt_scale: adapt the movement scale based on the convex hull of the keypoints.
 - ratio: the proportion of the original image occupied by the pasted-back generated face; adjust it according to the output, especially when adjacent faces are close in a multi-person image. The default is 0.4 and the adjustable range is [0.4, 0.5]; the sketch after this list illustrates the paste-back idea.
+- multi_person: indicates that the image contains multiple faces; without this flag, a single face is assumed.
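+
+To make the paste-back step concrete, here is a simplified, self-contained sketch of the idea; the `paste_back` helper below is illustrative only and is not PaddleGAN's actual implementation (the real pipeline also blends the pasted region):
+
+```
+import cv2
+
+def paste_back(original, generated_face, box):
+    # Illustrative only: resize the generated face patch to the detected
+    # face box and paste it into a copy of the original photo.
+    # `original` and `generated_face` are numpy arrays (e.g. from cv2.imread);
+    # `box` is (x1, y1, x2, y2) in original-image coordinates.
+    x1, y1, x2, y2 = box
+    patch = cv2.resize(generated_face, (x2 - x1, y2 - y1))
+    result = original.copy()
+    result[y1:y2, x1:x2] = patch
+    return result
+```
+
+When adjacent face boxes overlap, one paste can overwrite part of another, which is why the docs recommend tuning ratio for closely spaced faces.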
+
+### 2 Training
+**Datasets:**
+- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
+- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)
+
+**params:**
+- dataset_name.yaml: configure your own yaml file and parameters
+
+- Single-GPU training:
+```
+export CUDA_VISIBLE_DEVICES=0
+python tools/main.py --config-file configs/dataset_name.yaml
+```
+- Multi-GPU training:
+```
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch \
+    tools/main.py \
+    --config-file configs/dataset_name.yaml
+```
+
+**Example:**
+- Single-GPU training:
+```
+export CUDA_VISIBLE_DEVICES=0
+python tools/main.py --config-file configs/firstorder_fashion.yaml
+```
+- Multi-GPU training:
+```
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch \
+    tools/main.py \
+    --config-file configs/firstorder_fashion.yaml
+```
 
 **Online demo project**
diff --git a/docs/zh_CN/tutorials/wav2lip.md b/docs/zh_CN/tutorials/wav2lip.md
index f6cf669f33c9782d7001255a16a042471e198c46..d2d2f6db26be83908d24bed677ff1047c47831f2 100644
--- a/docs/zh_CN/tutorials/wav2lip.md
+++ b/docs/zh_CN/tutorials/wav2lip.md
@@ -45,7 +45,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
 ```
 export CUDA_VISIBLE_DEVICES=0,1,2,3
 python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
     tools/main.py \
     --config-file configs/wav2lip.yaml
 
@@ -60,7 +59,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
 ```
 export CUDA_VISIBLE_DEVICES=0,1,2,3
 python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
     tools/main.py \
     --config-file configs/wav2lip_hq.yaml
 
diff --git a/ppgan/apps/first_order_predictor.py b/ppgan/apps/first_order_predictor.py
index c267ce4855a8821c668d0471ddc335c77485e8bc..cfc7c62976d87d48d8d925ee4a3a8a5f238fef0b 100644
--- a/ppgan/apps/first_order_predictor.py
+++ b/ppgan/apps/first_order_predictor.py
@@ -146,6 +146,7 @@ class FirstOrderPredictor(BasePredictor):
             for im in reader:
                 driving_video.append(im)
         except RuntimeError:
+            print("Error reading driving video!")
             pass
         reader.close()