Deep Image Analogy is a technique to find semantically-meaningful dense correspondences between two input images. It adapts the notion of image analogy with features extracted from a Deep Convolutional Neural Network.
Deep Image Analogy was first described in a SIGGRAPH 2017 paper.
This is a reimplementation of Deep Image Analogy in C++ combined with CUDA. It is worth noting that:
- The code is based on Caffe.
- The code has only been tested on Ubuntu with CUDA 8 or 7.5.
- The code requires a machine with a GPU, and has been tested on an NVIDIA GeForce GTX 1080.
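Before building, you can sanity-check the CUDA toolkit and GPU with the standard NVIDIA tools:

```sh
nvcc --version   # should report CUDA 8.0 or 7.5
nvidia-smi       # should list a CUDA-capable GPU, e.g. a GTX 1080
```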
© Microsoft, 2017. Licensed under a BSD 2-Clause license.
If you find Deep Image Analogy (including deep PatchMatch) helpful for your research, please consider citing:
```
@article{liao2017visual,
  title={Visual Attribute Transfer through Deep Image Analogy},
  author={Liao, Jing and Yao, Yuan and Yuan, Lu and Hua, Gang and Kang, Sing Bing},
  journal={arXiv preprint arXiv:1705.01088},
  year={2017}
}
```
- Linux and CUDA 8 or 7.5
- Install the dependencies for building Caffe; just follow the official Caffe installation tutorial.
- Download the VGG-19 Caffemodel with the configuration script: `sh scripts/config_deep_image_analogy.sh`
- Modify the CUDA path in `Makefile.config.example` and rename it to `Makefile.config`.
- Compile Caffe (make sure you have installed all the dependencies first): `make all`
- Compile deep_image_analogy: `sh scripts/make_deep_image_analogy.sh`
- Add the libraries built by Caffe to `LD_LIBRARY_PATH`: `export LD_LIBRARY_PATH="./build/lib"`. The full sequence is consolidated in the sketch below.
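Putting the steps together, a typical build session looks like this sketch; the CUDA directory shown is an assumption, so point it at your own installation:

```sh
# Consolidated build sketch (the CUDA path in the comment is an assumption).
sh scripts/config_deep_image_analogy.sh   # download the VGG-19 Caffemodel

cp Makefile.config.example Makefile.config
# Edit Makefile.config so CUDA_DIR points at your CUDA install, e.g.:
#   CUDA_DIR := /usr/local/cuda-8.0

make all                                  # compile Caffe
sh scripts/make_deep_image_analogy.sh     # compile deep_image_analogy
export LD_LIBRARY_PATH="./build/lib"      # let the binaries find Caffe's libraries
```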
To run the code on multiple images, prepare the input datasets with the following steps:
- Put the content images and style images into the `./deep_image_analogy/images_content` and `./deep_image_analogy/images_style` folders.
- Edit the configuration in `./script/generate_list.sh` and then run `sudo sh ./script/generate_list.sh` to generate the file lists.
- Finally, the datasets will be laid out as follows:

```
datasets/images_content/*.jpg
datasets/images_style/*.jpg
datasets/content_list.txt
datasets/style_list.txt
```
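If you prefer to build the file lists by hand, the sketch below assumes each list is simply one image path per line; this format is an assumption, so check `./script/generate_list.sh` for the authoritative layout:

```sh
# Hand-rolled list generation (assumes one image path per line).
ls datasets/images_content/*.jpg > datasets/content_list.txt
ls datasets/images_style/*.jpg   > datasets/style_list.txt
```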
Tips: The size of the input images is limited; in most cases they should not be larger than 700x500 if you use 1.0 for the ratio parameter. Oversized inputs can be scaled down beforehand, as shown below.
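One way to enforce this limit, assuming ImageMagick is installed (`mogrify` edits images in place, and the `>` suffix shrinks only images that exceed the bound):

```sh
# Shrink any image larger than 700x500 in place, preserving aspect ratio.
mogrify -resize "700x500>" datasets/images_content/*.jpg
mogrify -resize "700x500>" datasets/images_style/*.jpg
```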
To run the demo, just type:

```sh
sh ./script/experiments_run_multi.sh
```
You need to set several parameters, as described in the paper. Specifically, you need to set (see the illustrative sketch after this list):
- `path_model`, where the VGG-19 model is located.
- `path_A`, the file list of input content images A.
- `path_BP`, the file list of input style images BP.
- `path_output`, the output path; it will be created automatically.
- GPU Number, the ID of the GPU on which to run the experiment.
- Ratio, the ratio by which the inputs are resized before being fed into the network.
- Blend Weight, the weight level used in the blending step.
- Flag of WLS Filter: if you are doing photo style transfer, we recommend switching this on to preserve the structure of the original photo.
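As a sketch, the parameter block inside `./script/experiments_run_multi.sh` might look like the following. The variable names and values here are illustrative assumptions, not the script's actual contents:

```sh
# Illustrative parameter block (names and paths are assumptions).
path_model=./deep_image_analogy/models/   # where the VGG-19 model lives
path_A=./datasets/content_list.txt        # file list of content images A
path_BP=./datasets/style_list.txt         # file list of style images BP
path_output=./datasets/output/            # created automatically
gpu_id=0                                  # ID of the GPU to run on
ratio=1.0                                 # resize ratio before the network
blend_weight=3                            # higher = closer to the content photo
use_wls_filter=0                          # set to 1 for photo style transfer
```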
- We often test images of size 600x400 and 448x448.
- We set ratio to 1.0 by default. For face (portrait) cases in particular, we find that ratio = 0.5 often gives better results.
- Blend weight controls the appearance of the result: increase it to make the result more like the original content photo; reduce it to make the result more faithful to the style.
- For the four applications, our settings are mostly (but not always):
- Photo to Style: blend weight=3, ratio=0.5 for face and ratio=1 for other cases.
- Style to Style: blend weight=3, ratio=1.
- Style to Photo: blend weight=2, ratio=0.5.
- Photo to Photo: blend weight=3, ratio=1.
Our code builds on Eigen, PatchMatch, CudaLBFGS and Caffe, and we gratefully acknowledge them. We also thank the authors of our image and style examples, but we do not own their copyrights.