Mixed Palettization of SD 1.5 LCM model #326

Open
indoflaven opened this issue Apr 12, 2024 · 4 comments
@indoflaven

I'm trying to create a mixed-palette version of an SD 1.5 LCM model. Specifically, I'm using Lykon/dreamshaper-8-lcm from Hugging Face. I tried using the pre-generated recipe for SD 1.5, but got the following error when running the 4.85-bit recipe:

File "/Users/michaelhein/Documents/GitHub/ml-stable-diffusion/python_coreml_stable_diffusion/mixed_bit_compression_apply.py", line 71, in main
    assert(pdist.min() < 0.01)
AssertionError
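
For context, here is the shape of the invocation that produced this; the flag names follow the repo's README as I understand it, so treat them as approximate, and the recipe key is a placeholder:

    python -m python_coreml_stable_diffusion.mixed_bit_compression_apply \
      --mlpackage-path <path-to-unet-mlpackage> \
      --pre-analysis-json-path <path-to-recipes-json> \
      --selected-recipe <recipe-key-for-4.85-bit> \
      -o <output-dir>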

So I tried to create my own recipe and I get the following error:

 File "/opt/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 1231, in __call__
    hidden_states = F.scaled_dot_product_attention(
RuntimeError: Invalid buffer size: 20.25 GB

Perhaps I just need a system with more memory to complete this task (I'm using an M1 MacBook Air with only 8 GB of RAM), but I thought I'd post here in case there's something I'm doing wrong.
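
(As a possible stopgap for the buffer-size error, and assuming the pre-analysis drives a standard diffusers pipeline you can get a handle on: attention slicing is a stock diffusers knob that shrinks the scaled_dot_product_attention buffers at some speed cost. This is a sketch, not something the repo's scripts expose:)

    # Hypothetical low-memory sketch using stock diffusers APIs; the model id
    # is the one from this thread, everything else is illustrative.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "Lykon/dreamshaper-8-lcm", torch_dtype=torch.float16
    )
    pipe.enable_attention_slicing()  # compute attention in slices to cap peak memory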

atiorh (Collaborator) commented Apr 12, 2024

Thanks for the report @indoflaven! Yes, matching the recipe results to the weights of a non-base model seems to be broken. I (not Apple) am working on a fix since I wrote this part of the code; we are actively building an improved version that will include it. I will ping you once we release it, but Apple might fix it before I do. cc: @aseemw
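
For readers hitting the same AssertionError, here is a minimal sketch of the kind of check that fails. This is not the repo's actual code; the per-layer "signature" and tolerance are assumptions for illustration:

    # Sketch: the recipe stores per-layer signatures computed from base SD 1.5
    # weights; apply-time matching requires a near-exact hit, which a fine-tuned
    # checkpoint (e.g. an LCM distillation) no longer satisfies.
    import numpy as np
    from scipy.spatial.distance import cdist

    def match_layer(layer_signature, recipe_signatures, tol=0.01):
        dists = cdist(layer_signature[None, :], recipe_signatures)[0]
        assert dists.min() < tol  # fails for drifted (fine-tuned) weights
        return int(dists.argmin())

    base = np.random.rand(10, 8)   # signatures from base SD 1.5 (illustrative)
    drifted = base[3] + 0.5        # a fine-tuned layer whose weights moved away
    match_layer(drifted, base)     # raises AssertionError, as in the report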

@indoflaven (Author)

@atiorh Thanks! Another question about mixed-bit palettization: say I'm compressing a model like SD-Turbo, which has a Unet, VAE Decoder, and Text Encoder. Can I only use mixed-bit palettization on the Unet, or can I also run it on the VAE Decoder and Text Encoder? If not, can I pair the mixed-bit Unet with a standard 6-bit VAE and Text Encoder?

Also, mixed-bit palettization produces an mlpackage. How can I compile this the same way --bundle-resources-for-swift-cli compiles everything for standard palettization?

Thanks!

atiorh (Collaborator) commented Apr 13, 2024

> @atiorh Thanks! Another question about mixed-bit palettization: say I'm compressing a model like SD-Turbo, which has a Unet, VAE Decoder, and Text Encoder. Can I only use mixed-bit palettization on the Unet, or can I also run it on the VAE Decoder and Text Encoder?

The implementation of MBP in this repo is tied to the Unet only. If you want to get your hands dirty with an example, check out how whisperkittools uses a generic implementation of MBP.

> If not, can I pair the mixed-bit Unet with a standard 6-bit VAE and Text Encoder?

Yes, I recommend using 6-bit VAE + TextEncoder in conjunction with an MBP Unet. You just need to create them separately for now.
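
A sketch of producing those 6-bit pieces separately; the torch2coreml flags below are taken from the repo's README as I recall it, so verify them against your checkout:

    python -m python_coreml_stable_diffusion.torch2coreml \
      --model-version <model-version> \
      --convert-text-encoder --convert-vae-decoder \
      --quantize-nbits 6 \
      -o <output-dir>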

> Also, mixed-bit palettization produces an mlpackage. How can I compile this the same way --bundle-resources-for-swift-cli compiles everything for standard palettization?

coremlcompiler compile <path-to-mlpackage> .

@indoflaven (Author)

Thanks again. For anyone else coming to this post in the future, xcrun coremlcompiler compile <path-to-mlpackage> . worked for me.
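
If it helps, here is a sketch of assembling a Swift-CLI-style bundle by hand. The model file names are assumptions mirroring what --bundle-resources-for-swift-cli produces, so compare against a bundle generated by that flag:

    mkdir -p Resources
    for pkg in Unet.mlpackage TextEncoder.mlpackage VAEDecoder.mlpackage; do
      xcrun coremlcompiler compile "$pkg" Resources/   # emits <name>.mlmodelc
    done
    # The generated bundle also carries tokenizer files (e.g. vocab.json,
    # merges.txt); copy those over from a bundle made with the flag above.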
