# Pixel2Style2Pixel

## Pixel2Style2Pixel introduction

The task of Pixel2Style2Pixel is image encoding: it encodes an input image into the style vectors of StyleGAN V2 and uses StyleGAN V2 as the decoder.

<div align="center">
  <img src="../../imgs/pSp-teaser.jpg" width="500"/>
</div>

Pixel2Style2Pixel uses a fairly large encoder model to map the input image into the style vector space of StyleGAN V2, so that the decoded image remains strongly correlated with the original input.
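
To make the data flow concrete, the sketch below shows the basic idea: an encoder maps the image to one style vector per StyleGAN V2 layer, and a pretrained StyleGAN V2 generator decodes those vectors back into an image. The `ToyEncoder` here is a hypothetical stand-in, not PaddleGAN's actual pSp encoder (which is a much larger feature-pyramid network), and the decoder step is only indicated in comments.

```
import paddle

class ToyEncoder(paddle.nn.Layer):
    """Hypothetical stand-in for the pSp encoder, for illustration only."""
    def __init__(self, n_styles=18, style_dim=512):
        super().__init__()
        self.backbone = paddle.nn.Sequential(
            paddle.nn.Conv2D(3, 64, kernel_size=3, stride=2, padding=1),
            paddle.nn.ReLU(),
            paddle.nn.AdaptiveAvgPool2D(1),
        )
        self.to_styles = paddle.nn.Linear(64, n_styles * style_dim)
        self.n_styles, self.style_dim = n_styles, style_dim

    def forward(self, image):
        feat = self.backbone(image).flatten(start_axis=1)   # [N, 64] image features
        styles = self.to_styles(feat)                        # [N, n_styles * style_dim]
        # one style vector per StyleGAN V2 layer
        return styles.reshape([-1, self.n_styles, self.style_dim])

encoder = ToyEncoder()
image = paddle.rand([1, 3, 256, 256])   # dummy input portrait
w_plus = encoder(image)                 # [1, 18, 512] style codes
print(w_plus.shape)
# In pSp, these style codes are fed into a pretrained StyleGAN V2 generator,
# which acts as a fixed decoder and produces the output image.
```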

Its main functions are:

- Convert an image into latent (style) codes
- Face frontalization (turning a face to a frontal view)
- Generate images from sketches or segmentation maps
- Convert low-resolution images into high-resolution images

At present, only the portrait reconstruction and portrait cartoonization models are available in PaddleGAN.

## How to use

### Generate

Users can run the following command to generate images, choosing a local image as input:

```
cd applications/
python -u tools/pixel2style2pixel.py \
       --input_image <YOUR INPUT IMAGE> \
       --output_path <DIRECTORY TO STORE OUTPUT IMAGE> \
       --weight_path <YOUR PRETRAINED MODEL PATH> \
       --model_type ffhq-inversion \
       --seed 233 \
       --size 1024 \
       --style_dim 512 \
       --n_mlp 8 \
       --channel_multiplier 2 \
       --cpu
```

**params:**
- input_image: the input image file path
- output_path: the directory where the generated images are stored
- weight_path: pretrained model path
- model_type: built-in model type in PaddleGAN. If a built-in model type is specified, `weight_path` has no effect.
  Currently available: `ffhq-inversion`, `ffhq-toonify`
- seed: random number seed
- size: model parameter, the output image resolution
- style_dim: model parameter, the dimension of the style vector z
- n_mlp: model parameter, the number of multi-layer perceptron (MLP) layers used for style z
- channel_multiplier: model parameter, the channel multiplier, which affects model size and the quality of generated images
- cpu: use CPU for inference; remove this flag from the command to use GPU
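
If you prefer calling PaddleGAN from Python rather than the command line, a minimal sketch is shown below. It assumes the `Pixel2Style2PixelPredictor` app wrapper in `ppgan.apps` accepts the same options as the CLI flags above; the class name, argument names, and the example input path are assumptions, so verify them against your installed PaddleGAN version.

```
# Hedged sketch: assumes ppgan.apps provides Pixel2Style2PixelPredictor with
# arguments mirroring the CLI flags above (check your PaddleGAN version).
from ppgan.apps import Pixel2Style2PixelPredictor

predictor = Pixel2Style2PixelPredictor(
    output_path='output_dir',      # directory where generated images are stored
    model_type='ffhq-inversion',   # or 'ffhq-toonify' for the cartoonization model
    seed=233,
    size=1024,
    style_dim=512,
    n_mlp=8,
    channel_multiplier=2,
)
predictor.run('face.png')          # example path to a local portrait image
```

Since `model_type` selects a built-in pretrained model here, `weight_path` is not needed; to use your own checkpoint, pass `weight_path` instead of `model_type`.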

### Train (TODO)

Training scripts will be added in the future so that users can train more types of Pixel2Style2Pixel image encoders.


## Results

Input portrait:

<div align="center">
    <img src="../../imgs/pSp-input.jpg" width="300"/> 
</div>

Cropped portrait, reconstructed portrait, and cartoonized portrait (left to right):

<div align="center">
    <img src="../../imgs/pSp-input-crop.png" width="100"/>
    <img src="../../imgs/pSp-inversion.png" width="100"/>
    <img src="../../imgs/pSp-toonify.png" width="100"/> 
</div>

## Reference

```
@article{richardson2020encoding,
  title={Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation},
  author={Richardson, Elad and Alaluf, Yuval and Patashnik, Or and Nitzan, Yotam and Azar, Yaniv and Shapiro, Stav and Cohen-Or, Daniel},
  journal={arXiv preprint arXiv:2008.00951},
  year={2020}
}
```