-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support TRTLLM model in the benchmark script #442
Conversation
Does this PR essentially include the work of #412? |
@matthewkotila I guess it sort of does 😅 I forgot about that PR. The intention was to update the doc since the tutorial guide no longer builds its own vllm container and just relies on the vllm backend. |
That's totally ok if it does, I just wanted to know if the linked PR can be closed, that's all 🙏 |
@@ -420,6 +420,13 @@ def profile(args, export_file): | |||
f"--input-data={INPUT_FILENAME} " | |||
f"--profile-export-file={export_file} " | |||
) | |||
if args.model == "ensemble": # TRT-LLM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these TRT-LLM specific options or options specific to any ensemble?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a way of detecting if the model is trtllm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absolutely but will any other model use ensemble as its top model?
feels like we are trying to use a reserved word as a variable type of thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe others will? It's not an ideal way of detecting, hence my suggestion in here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added --backend
argument to take in backend type as command line option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the behavior when the wrong backend is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think we should throw an error when user specifies unsupported backend type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@debermudez Added the check.
return input_data | ||
|
||
|
||
def main(args): | ||
input_data = construct_input_data(args) | ||
if args.model == "ensemble": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want this to be trtllm?
do we have other backends that we plan to support that will also use ensemble?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good point. Not ideal for detecting the model is trtllm. Better might be allowing user of profile.py to specify what backend they're using (vllm/trtllm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that was the point of the -m option.
Is that not the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-m
provides model name, that's different from what backend the model uses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i appreciate the clarification.
The terms are a bit overloaded so its great to get clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree with your suggestion
ensemble
) model in the benchmark script