[sharktank] Evaluation - Add Perplexity test for vmfb #306
Merged
Commits (65 total; this view shows changes from 55 commits)
07130b8 Get baseline_perplexity_scores from azure sharkpublic blob (archana-ramalingam)
cd21d75 Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
ebe1e69 Add perplexity for vmfb (archana-ramalingam)
c6b9998 Merge branch 'perplexity-vmfb' of https://github.com/nod-ai/SHARK-Pla… (archana-ramalingam)
aa47d67 Add vmfb runner script (archana-ramalingam)
1a7933a Update test (archana-ramalingam)
026318a Rename perplexity torch test (archana-ramalingam)
a2d7c7a Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
089f590 Revert npy to json (archana-ramalingam)
1711f85 Update gguf to irpa (archana-ramalingam)
74b376f Add vmfb test (archana-ramalingam)
6a9b5b3 Reduce tqdm progress print frequency (archana-ramalingam)
dfa3218 Add -s flag for pytest to display test progress (archana-ramalingam)
0e83a2a Merge main with branch (archana-ramalingam)
7c85d0d Update vmfb perplexity (archana-ramalingam)
688d208 Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
26b48de Address review comments (archana-ramalingam)
4bb5857 Merge branch 'perplexity-vmfb' of https://github.com/nod-ai/SHARK-Pla… (archana-ramalingam)
3945f37 Add export & compile tests (archana-ramalingam)
c9fa072 Update export test script (archana-ramalingam)
7f4de96 Cleanup (archana-ramalingam)
1a26ed7 Test export (archana-ramalingam)
2725512 Update artifacts dir (archana-ramalingam)
d4d1d18 Add batch size (archana-ramalingam)
3c22732 Merge main (archana-ramalingam)
1f02051 Test export (archana-ramalingam)
6190176 Remove artifacts dir (archana-ramalingam)
9fe2c40 Remove export test and add as tool (archana-ramalingam)
cf6ee83 Add log messages (archana-ramalingam)
9dbc07a Add log messages (archana-ramalingam)
f5c4fef Update vmfb runner module name dynamically (archana-ramalingam)
3a91051 Update llallama3_8B_f16_decomposed_vmfb perplexities (archana-ramalingam)
006c5d4 Move CI to mi300x-3 (archana-ramalingam)
7fe9594 Address review comments (archana-ramalingam)
03baccb Revert debug to info logging (archana-ramalingam)
52a6fc1 Test (archana-ramalingam)
da04fd1 Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
d1ed9a2 Update export mlir to remove tensor_parallelism_size arg (archana-ramalingam)
8ab20e0 Merge branch 'perplexity-vmfb' of https://github.com/nod-ai/SHARK-Pla… (archana-ramalingam)
1876f54 Make non_decomposed version the default (archana-ramalingam)
8b274da Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
563f72e Fix export cmd string parsing issues (archana-ramalingam)
e58a10c Merge branch 'perplexity-vmfb' of https://github.com/nod-ai/SHARK-Pla… (archana-ramalingam)
4607fb2 Upgrade to latest iree to resolve dynamo error (archana-ramalingam)
19e29d9 Add error handling if mlir export fails (archana-ramalingam)
493feeb Update perplexity scores (archana-ramalingam)
b65c882 test benchmark export (archana-ramalingam)
ea311e8 test benchmark export (archana-ramalingam)
b220688 Remove export tests (archana-ramalingam)
2a79eda Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
09796b7 Remove hardcoded paths (archana-ramalingam)
8069f24 Xfail 405b as sharding vmfb is unsupported (archana-ramalingam)
fb78644 Update mi-300x-3 path (archana-ramalingam)
c3aa964 Test pytest command (archana-ramalingam)
7d277d3 Test pytest command (archana-ramalingam)
5f54084 Revert benchmarking test changes (archana-ramalingam)
052f24a Revert debug changes (archana-ramalingam)
a9227c7 Xfail 405b eager mode perplexity till sharding is fixed (archana-ramalingam)
31aebbd Add xfail to 405b as sharding needs to be fixed (archana-ramalingam)
461034b Final testing (archana-ramalingam)
22da6e7 Fix CI test script (archana-ramalingam)
fe4988a Remove CI debugging (archana-ramalingam)
fb7d720 Merge branch 'main' into perplexity-vmfb (archana-ramalingam)
e2c6c17 Remove dummy 405b vmfb baseline numbers (archana-ramalingam)
59082ed Merge branch 'perplexity-vmfb' of https://github.com/nod-ai/SHARK-Pla… (archana-ramalingam)
@@ -1,6 +1,7 @@
 name: Evaluation Tests

 on:
+  pull_request:
   workflow_dispatch:
   schedule:
     # Weekdays nightly at 07:00 UTC = 23:00 PST / 00:00 PDT.
@@ -15,13 +16,13 @@ concurrency:
   cancel-in-progress: true

 jobs:
-  test_perplexity:
+  test_perplexity_vmfb:
     timeout-minutes: 1000
-    name: "Evaluation Tests - perplexity"
+    name: "Evaluation Tests - perplexity_vmfb"
     strategy:
       matrix:
         version: [3.11]
-        runs-on: [llama-mi300]
+        runs-on: [llama-mi300x-3]
       fail-fast: false
     runs-on: ${{matrix.runs-on}}
     defaults:
@@ -58,5 +59,61 @@ jobs:
             -e "git+https://github.com/iree-org/iree-turbine.git#egg=iree-turbine"
           pip install --no-compile -r requirements.txt -r sharktank/requirements-tests.txt -e sharktank/

-      - name: Run perplexity test
-        run: pytest -n 4 -v -s sharktank/tests/evaluate/perplexity_test.py --longrun
+          # Try with the latest nightly releases, not what iree-turbine pins.
+          # We could also pin to a known working or stable version.
+          # This should eventually stabilize. Do the best we can for now.
+          pip install -f https://iree.dev/pip-release-links.html --upgrade \
+            iree-compiler \
+            iree-runtime \
+            "numpy<2.0"
+      - name: Run perplexity test with vmfb
+        run: pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_vmfb_test.py --longrun --iree-device='hip://7' --iree-hip-target='gfx942' --llama3-8b-f16-model-path=/data/llama-3.1/8b/llama8b_f16.irpa --llama3-8b-tokenizer-path=/data/llama-3.1/8b/tokenizer_config.json
+
+  # test_perplexity_torch:
Review comment: Commented test?
Reply: As commented above, this will be removed after testing the vmfb version.
archana-ramalingam marked this conversation as resolved.
+  # timeout-minutes: 1000
+  # name: "Evaluation Tests - perplexity_torch"
+  # strategy:
+  # matrix:
+  # version: [3.11]
+  # runs-on: [llama-mi300]
+  # fail-fast: false
+  # runs-on: ${{matrix.runs-on}}
+  # defaults:
+  # run:
+  # shell: bash
+  # env:
+  # PIP_CACHE_DIR: "${{ github.workspace }}/.pip-cache"
+  # SHARK_PLATFORM_REPO_ROOT: ${{ github.workspace }}
+  # steps:
+  # - name: "Setting up Python"
+  # id: setup_python
+  # uses: actions/setup-python@v3
+  # with:
+  # python-version: ${{matrix.version}}
+
+  # - name: "Checkout Code"
+  # uses: actions/checkout@v3
+
+  # - name: Cache Pip Packages
+  # uses: actions/cache@v4
+  # id: cache-pip
+  # with:
+  # path: ${{ env.PIP_CACHE_DIR }}
+  # key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('*requirements.txt') }}
+
+  # - name: Install sharktank deps
+  # run: |
+  # python -m pip install --no-compile --upgrade pip
+  # # Note: We install in three steps in order to satisfy requirements
+  # # from non default locations first. Installing the PyTorch CPU
+  # # wheels saves multiple minutes and a lot of bandwidth on runner setup.
+  # pip install --no-compile -r pytorch-cpu-requirements.txt
+  # pip install --no-compile -f https://iree.dev/pip-release-links.html --src deps \
+  # -e "git+https://github.com/iree-org/iree-turbine.git#egg=iree-turbine"
+  # pip install --no-compile -r requirements.txt -r sharktank/requirements-tests.txt -e sharktank/
+
+  # - name: Run perplexity test in eager mode
+  # run: pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_vmfb_test.py \
+  # --longrun \
+  # --llama3-8b-f16-model-path=/data/llama-3.1/8b/llama8b_f16.irpa \
+  # --llama3-8b-tokenizer-path=/data/llama-3.1/8b/tokenizer_config.json
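The vmfb test command in this workflow passes custom flags (--longrun, --iree-device, --iree-hip-target, --llama3-8b-f16-model-path, --llama3-8b-tokenizer-path). pytest rejects unknown flags unless they are registered, which is usually done in a conftest.py through the pytest_addoption hook. The following is a minimal sketch of such a registration, not the sharktank conftest: the flag names are copied from the command above, while the defaults and help strings are assumptions made for illustration.

```python
# conftest.py (hypothetical sketch; only the flag names come from the workflow)


def pytest_addoption(parser):
    """Register the custom flags that the CI pytest command passes."""
    # Boolean switch: long-running tests are opt-in (default is an assumption).
    parser.addoption(
        "--longrun",
        action="store_true",
        default=False,
        help="Enable long-running tests",
    )
    # String-valued options; defaults of None are assumptions.
    for flag, description in [
        ("--iree-device", "IREE device URI, e.g. hip://7"),
        ("--iree-hip-target", "HIP target architecture, e.g. gfx942"),
        ("--llama3-8b-f16-model-path", "Path to the Llama 3 8B f16 .irpa file"),
        ("--llama3-8b-tokenizer-path", "Path to the tokenizer config JSON"),
    ]:
        parser.addoption(flag, action="store", default=None, help=description)
```

A test or fixture could then read a value with request.config.getoption("--iree-device"), which is standard pytest behavior for registered options.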
Review comment: Is this temporary for testing?
Reply: Yes
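For readers unfamiliar with the metric this PR tests: perplexity is the exponential of the mean negative log-likelihood per token, and a regression test typically compares the measured value against a stored baseline. The sketch below illustrates that idea only. It is not the sharktank implementation; the function names and the 5% relative tolerance are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def within_baseline(current, baseline, rel_tol=0.05):
    """Check a measured perplexity against a baseline (rel_tol is an assumed value)."""
    return math.isclose(current, baseline, rel_tol=rel_tol)


# A model that assigns probability 0.25 to every correct token is as
# uncertain as a uniform choice among 4 options, so perplexity is about 4.
log_probs = [math.log(0.25)] * 4
score = perplexity(log_probs)
print(round(score, 6), within_baseline(score, baseline=4.0))
```

Lower perplexity means the model assigns higher probability to the reference tokens, so a compiled vmfb whose score drifts far from the baseline suggests a numerical difference worth investigating.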