Description
Found a bug when walking through the shortfin llm docs using the latest nightly sharktank. gguf is currently incompatible with numpy > 2. This breaks sharktank.examples.export_paged_llm_v1 on linux.
The gguf issue is filed here. It was closed from inactivity, but isn't actually solved and has a PR open for the fix.
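To check quickly whether an environment is likely to hit this, the following is a minimal sketch, not part of sharktank or gguf. The helper names (`major_version`, `numpy_breaks_gguf`, `installed_numpy_version`) are hypothetical. It assumes every NumPy 2.x release is affected, although the repro below only demonstrates a failure with 2.1.3 and a success with 1.26.3.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.1.3'."""
    return int(version.split(".")[0])


def numpy_breaks_gguf(numpy_version: str) -> bool:
    """Guess whether this NumPy version triggers the gguf incompatibility."""
    # Assumption: all 2.x releases are affected. The report only shows
    # 2.1.3 failing and 1.26.3 working.
    return major_version(numpy_version) >= 2


def installed_numpy_version() -> Optional[str]:
    """Return the installed NumPy version, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("numpy is not installed in this environment")
    elif numpy_breaks_gguf(found):
        print(f"numpy {found}: the gguf export will probably fail")
    else:
        print(f"numpy {found}: should be compatible with gguf")
```

Run inside the virtual environment, this gives the same information as `pip show numpy | grep Version`, plus a hint about whether the export is expected to fail.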
Repro Steps
On linux:
Before re-pinning
Create a virtual environment:
Install dependencies and sharktank:
Show numpy version (before re-pinning):
pip show numpy | grep Version
Version: 2.1.3
Try running export_paged_llm_v1:
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
You'll see this error:
After re-pinning
Create a virtual environment:
Install dependencies and sharktank:
Show numpy version:
pip show numpy | grep Version
Version: 1.26.3
Run export_paged_llm_v1:
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
With re-pinning we get the desired output: