
[Update: Jannied :marseyitsover:] r/StableDiffusion: Comparison of DreamBooth and Textual Inversion (Meet Marsey!) :marseywave:

https://old.reddit.com/r/StableDiffusion/comments/xjlv19/comparison_of_dreambooth_and_textual_inversion

:#marseyclapping: @float-trip

Meet Marsey! An adorable cat from a Telegram sticker pack. I've been trying to get SD to generate more of this character, and wanted to share my results for anyone else working on a specific 2D style.

Comparisons


a photo of a spaceman Marsey in outer space

Textual Inversion / DreamBooth

https://i.rdrama.net/images/16841357703112879.webp / https://i.rdrama.net/images/16841357715438375.webp

a photo of Marsey as a lifeguard

Textual Inversion / DreamBooth

https://i.rdrama.net/images/16841357726722727.webp / https://i.rdrama.net/images/1684135773795822.webp

a photo of Marsey as a scientist

Textual Inversion / DreamBooth

https://i.rdrama.net/images/16841357749498625.webp / https://i.rdrama.net/images/16841357767355447.webp

a photo of Marsey as a gardener

Textual Inversion / DreamBooth

https://i.rdrama.net/images/16841357782806287.webp / https://i.rdrama.net/images/168413577961412.webp

What I've noticed:


Textual inversion:

DreamBooth:

  • Far, far better for my use case. The character is more editable and the composition improves. It doesn't match the art style quite as well, though.

  • Training on 3 images worked better than training on 72

  • Works extremely well with cross-attention prompt2prompt (the "img2img alternative test" script in automatic1111's UI)

  • 1,000 steps (30 min on an A6000) is sufficient for good results

  • Worth mentioning: it's usable with Deforum for animations

Combining the two doesn't seem to work, unfortunately. The next step might be either to finetune the network itself directly and apply one of these techniques afterwards, or possibly to train the classifier.


So wait, dreambooth takes 30gb of VRAM to run, right - but does it spit out embeddings that you can use with Stable Diffusion, like Textual Inversion does? I hope someone rents a GPU and makes a big database website of popular characters and shit, especially if you could fetch that data from a stable diffusion client. That would be extremely useful.

Exciting times for AI, very nice marsey results btw :marseystars2:

(this is my 1000th comment!!!!! :marsoyhype:)


It gives you an entirely new 2 GB model, so sadly it's pretty heavyweight. It might be possible to train multiple objects into one model in the future, though. I'm expecting all this to keep changing rapidly for a while.
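
For a rough sense of scale, here's a back-of-the-envelope sketch comparing the two outputs. The embedding numbers are assumptions, not something measured in this thread: one learned token vector, 768 dimensions (the CLIP text encoder width in Stable Diffusion v1.x), stored as float32. The function name is made up for illustration.

```python
# Rough storage comparison: a textual inversion embedding vs. a DreamBooth checkpoint.
# Assumptions (not from the thread): one learned token vector, 768 dimensions
# (the CLIP text encoder width used by Stable Diffusion v1.x), stored as float32.
EMBEDDING_DIM = 768
BYTES_PER_FLOAT32 = 4


def textual_inversion_bytes(num_vectors: int = 1) -> int:
    """Size of the learned embedding vectors alone, ignoring file overhead."""
    return num_vectors * EMBEDDING_DIM * BYTES_PER_FLOAT32


# The 2 GB model size mentioned above, taken here as 2 * 10**9 bytes.
DREAMBOOTH_MODEL_BYTES = 2 * 10**9

embedding_bytes = textual_inversion_bytes()
ratio = DREAMBOOTH_MODEL_BYTES / embedding_bytes

print(f"embedding: {embedding_bytes} bytes")
print(f"DreamBooth model is roughly {ratio:,.0f}x larger")
```

Under those assumptions the embedding is a few kilobytes, which is why a shared database of textual inversion embeddings would be easy to host, while one of full DreamBooth models would not.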
