[sharktank] Evaluation - Add Perplexity test #233
Conversation
Just some minor optional changes.
llama_8b_f16_gguf_path = "/data/extra/models/llama3.1_8B/llama8b_f16.gguf"
llama_8b_f16_tokenizer_path = (
    "/data/extra/models/llama3.1_8B/tokenizer_config.json"
)
Unit tests failed on this PR and after merge:
- https://github.com/nod-ai/SHARK-Platform/actions/runs/11359973189/job/31596941530
- https://github.com/nod-ai/SHARK-Platform/actions/runs/11360375745/job/31598023494
Any files required to run a test should be one of:
- included in the repository (if small enough)
- downloaded (and cached) on demand as part of the test (see the sketch below)
- downloaded (and cached) ahead of time via a script

As it stands, this test will only run on a machine that has already gone through some unknown, undocumented setup steps.
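For the on-demand option, a minimal sketch using huggingface_hub (already a project dependency) could look like this; the repo id and filename below are placeholders, not the real location of these artifacts:

```python
import pytest
from huggingface_hub import hf_hub_download


@pytest.fixture(scope="session")
def llama_8b_f16_gguf_path() -> str:
    # hf_hub_download caches under ~/.cache/huggingface by default, so the
    # file is only fetched on the first run and reused afterwards.
    return hf_hub_download(
        repo_id="example-org/llama3.1-8b-gguf",  # placeholder repo id
        filename="llama8b_f16.gguf",  # placeholder filename
    )
```

The test would then take `llama_8b_f16_gguf_path` as a fixture argument instead of hardcoding a `/data/extra/models/...` path.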
- name: Run perplexity test
  run: pytest sharktank/tests/evaluate/perplexity_test.py
If more tests are going to land in this category, we could use pytest marks or some other filtering to pick the list of tests each workflow runs. As it stands, the new tests run in multiple workflows (on every commit and here on a nightly schedule).
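As a sketch (the mark name `expensive` is an assumption, not an existing project convention):

```python
# In conftest.py: register the mark so pytest does not warn about it.
def pytest_configure(config):
    config.addinivalue_line("markers", "expensive: long-running evaluation tests")


# In sharktank/tests/evaluate/perplexity_test.py: tag the slow test.
import pytest


@pytest.mark.expensive
def test_perplexity():
    ...
```

The nightly workflow could then run `pytest -m expensive` and the per-commit workflow `pytest -m "not expensive"`, so each test is picked up by exactly one of them.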
@@ -7,6 +7,7 @@ onnx==1.15.0
 huggingface-hub==0.22.2
 transformers==4.40.0
 sentencepiece==0.2.0
+datasets==3.0.0
We should steer towards putting requirements in the subproject requirements files instead of this top-level file, especially if this is a test-only requirement (a sketch of the move follows the links below):
- Existing requirements file specific to sharktank: https://github.com/nod-ai/SHARK-Platform/blob/main/sharktank/requirements.txt
- Test requirements file for shortfin: https://github.com/nod-ai/SHARK-Platform/blob/main/shortfin/requirements-tests.txt
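For example (a sharktank test requirements file does not exist yet, so that file name is hypothetical):

```diff
 # requirements.txt (top level): drop the test-only pin
-datasets==3.0.0

 # sharktank/requirements-tests.txt (hypothetical new file): add it here
+datasets==3.0.0
```

CI jobs that run the sharktank tests would then `pip install -r` that file in addition to the runtime requirements.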
This reverts commit e30d0af.
Add Perplexity test for LLM evaluation