Idea, if we're being extra arty about videos. #48
Comments
Sounds fun. A bit like story mode, but more interactive.

What inputs did you use for this outcome? Looks cool!!
Thanks, this was "manually guided". I can't remember exactly what the prompt was, something along the lines of "a shiny metal robot face with glowing blue eyes", but I started with an initial image of a human skull with roughly drawn blue eyeballs (just two circles with a black blob in the middle for a pupil and a couple of highlights for reflections). Then on each call to checkin, I break and await user input. At that point I can check whether the image is going how I want, and if it's not, I can load it in Krita or Pinta or something and roughly "repair" any features that aren't coming out quite as I like. A thick brush with a solid colour is usually sufficient, but I might also select an area and copy it, stretch or rotate a section, or use Krita's Heal tool to erase a feature. It doesn't need any artistic skill.

That's amazing! How did you do that? Could you share the code? Thanks!
This looks very much like something I would use! It's a great idea either way! |
(Almost) all the code you need to do this is already in generate.py. The first thing I did was add another command line argument:

```python
vq_parser.add_argument("-jr", "--justrun", action="store_true", help="Just run, no breaks", dest="just_run")
```

Next, I modified the main loop so that, unless `--justrun` is given, it pauses for user input every `display_freq` iterations:

```python
try:
    resetOptimizer = False
    with tqdm() as pbar:
        while True:
            train(i)
            if not args.just_run and i % args.display_freq == 0:
                print(f"Modify output{i}.png and press Y then Enter, or just Enter if no change made")
                y = input()
                if y == 'Y':
                    # Reload the (possibly hand-edited) image from disk
                    img = Image.open(f"output{i}.png")
                    pil_image = img.convert('RGB')
                    pil_image = pil_image.resize((sideX, sideY), Image.LANCZOS)
                    pil_tensor = TF.to_tensor(pil_image)
                    # Re-encode it into the VQGAN latent, exactly as for an init image
                    z, *_ = model.encode(pil_tensor.to(device).unsqueeze(0) * 2 - 1)
                    z_orig = z.clone()
                    z.requires_grad_(True)
                    resetOptimizer = True
            # ... the rest of the original loop body (incrementing i,
            # pbar.update(), stopping at max_iterations) continues here
```

If you want to run without waiting for input, you can pass `-jr`.
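The snippet above sets `resetOptimizer` but doesn't show where it's consumed. As a minimal sketch (not from the original comment), assuming the optimiser is the `opt = optim.Adam([z], lr=args.step_size)` instance that generate.py builds, the flag could be handled at the top of the loop body:

```python
            # Hypothetical handling of the flag (placement and names assumed):
            # rebuild the optimiser so stale Adam momentum doesn't immediately
            # drag the freshly re-encoded z back towards the pre-edit image.
            if resetOptimizer:
                opt = optim.Adam([z], lr=args.step_size)
                resetOptimizer = False
```

For reference, a guided run might then be launched with something like `python generate.py -p "a shiny metal robot face with glowing blue eyes" -ii skull_init.png -se 50` (`-p`, `-ii` and `-se` are generate.py's usual prompt, init-image and save-every options; `skull_init.png` is just an illustrative filename).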
Just a quick note: if you're using the `-o` command line option, the saved image will use that name instead of output.png, so adjust the filename in the code above accordingly.
Another change I've made for myself is to break every n iterations (after checkin) and await user input. If I input `Y`, it reloads the image from disk and reinitialises the optimiser (the same as you do for a zoom video). This way I can "guide" it quite forcefully: if I want a skull with glowing blue eyes, and the blue eyes are not picked up from the init image (or have dissolved into nothing) by the 50th step, I can paint them in. I can also "promote" features in the output by exaggerating their presence.

Since we're reinitialising the optimiser, we can presumably also switch up the prompts 'in the middle' of the run, when the loss has 'stabilised'? Depending on how far you want to take this (and I'll be doing my own experimentation), maybe we can draw up a timeline and construct a video based on prompts that change over time; a sketch of that idea follows.
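To make the timeline idea concrete, here's a rough sketch (none of this is from the thread; the schedule format and the re-encoding step are assumptions modelled on how generate.py builds its `pMs` prompt list with CLIP):

```python
# Hypothetical prompt timeline: (start_iteration, prompt) pairs.
prompt_schedule = [
    (0,   "a human skull with glowing blue eyes"),
    (200, "a shiny metal robot face with glowing blue eyes"),
    (400, "a chrome android portrait, dramatic lighting"),
]

def maybe_switch_prompt(i):
    """If iteration i starts a new schedule entry, replace the prompt targets.

    Assumes generate.py's globals: perceptor (the CLIP model), pMs (the list
    of encoded prompt targets), Prompt (the target wrapper class) and device.
    """
    for start, text in prompt_schedule:
        if i == start:
            embed = perceptor.encode_text(clip.tokenize(text).to(device)).float()
            pMs.clear()
            pMs.append(Prompt(embed).to(device))
```

Calling `maybe_switch_prompt(i)` once per iteration in the main loop, together with the optimiser reset above, would let the target drift over time; saving every frame then gives the raw material for a prompt-driven video.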