Patch SGL Benchmark Test for Pytest Dashboard #551

Merged (2 commits) Nov 15, 2024

Conversation

stbaione (Contributor)

Description

The nightly SGLang benchmark tests had their first successful run last night: https://github.com/nod-ai/shark-ai/actions/runs/11850084805/job/33024395622

The results were also uploaded to the dashboard successfully: https://nod-ai.github.io/shark-ai/llm/sglang/?sort=result

However, since I used a mock to pipe the bench_serving script's output to logger.info, the results appeared in the runner log but not in the dashboard:

============ Serving Benchmark Result ============
INFO     __name__:mock.py:1189 Backend:                                 shortfin  
INFO     __name__:mock.py:1189 Traffic request rate:                    4         
INFO     __name__:mock.py:1189 Successful requests:                     10        
INFO     __name__:mock.py:1189 Benchmark duration (s):                  716.95    
INFO     __name__:mock.py:1189 Total input tokens:                      1960      
INFO     __name__:mock.py:1189 Total generated tokens:                  2774      
INFO     __name__:mock.py:1189 Total generated tokens (retokenized):    291       
INFO     __name__:mock.py:1189 Request throughput (req/s):              0.01      
INFO     __name__:mock.py:1189 Input token throughput (tok/s):          2.73      
INFO     __name__:mock.py:1189 Output token throughput (tok/s):         3.87      
INFO     __name__:mock.py:1189 ----------------End-to-End Latency----------------
INFO     __name__:mock.py:1189 Mean E2E Latency (ms):                   549509.25 
INFO     __name__:mock.py:1189 Median E2E Latency (ms):                 578828.23 
INFO     __name__:mock.py:1189 ---------------Time to First Token----------------
INFO     __name__:mock.py:1189 Mean TTFT (ms):                          327289.54 
INFO     __name__:mock.py:1189 Median TTFT (ms):                        367482.31 
INFO     __name__:mock.py:1189 P99 TTFT (ms):                           367972.81 
INFO     __name__:mock.py:1189 -----Time per Output Token (excl. 1st token)------
INFO     __name__:mock.py:1189 Mean TPOT (ms):                          939.35    
INFO     __name__:mock.py:1189 Median TPOT (ms):                        886.13    
INFO     __name__:mock.py:1189 P99 TPOT (ms):                           2315.83   
INFO     __name__:mock.py:1189 ---------------Inter-token Latency----------------
INFO     __name__:mock.py:1189 Mean ITL (ms):                           732.59    
INFO     __name__:mock.py:1189 Median ITL (ms):                         729.43    
INFO     __name__:mock.py:1189 P99 ITL (ms):                            1477.77   
INFO     __name__:mock.py:1189 ==================================================

The test also had a small bug that was hidden in the runner/terminal logs but surfaced in the dashboard: it prevented the jsonl files from being generated after benchmark results were collected.

By fixing the bug in the bench_serving input args and logging the resulting jsonl file after each run, I was able to verify locally that the output html contains the proper results:

[image: pytest-html report showing the SGLang benchmark results]

- Log jsonl file after run to ensure it appears in pytest-html output
- Fix arg that was preventing jsonl file from being generated
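The first commit's approach can be sketched as follows. This is a hypothetical illustration, assuming the benchmark writes one JSON record per line to a jsonl file; the function name and file path are not the actual shark-ai test code:

```python
# Hypothetical sketch: after a benchmark run, read the generated jsonl
# file and log each record so it appears in the captured log section of
# the pytest-html report.
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

def log_jsonl_result(jsonl_path: Path) -> None:
    """Log every benchmark record found in the jsonl output file."""
    for line in jsonl_path.read_text().splitlines():
        if line.strip():
            record = json.loads(line)
            logger.info("Benchmark result: %s", record)
```

Logging the parsed records (rather than mocking print) keeps the structured jsonl file intact for the dashboard while still surfacing the results in the pytest-html output.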
@stbaione stbaione changed the title Path SGL Benchmark Test for Pytest Dashboard Patch SGL Benchmark Test for Pytest Dashboard Nov 15, 2024
@stbaione stbaione merged commit 2d3bf36 into nod-ai:main Nov 15, 2024
4 of 5 checks passed