Support TRTLLM model in the benchmark script #442

nv-hwoo · 2023-11-29T23:49:35Z

Support TRTLLM (e.g. ensemble) model in the benchmark script
Use Triton vLLM backend and update the outdated doc

matthewkotila · 2023-11-29T23:51:07Z

Does this PR essentially include the work of #412?

nv-hwoo · 2023-11-29T23:57:45Z

@matthewkotila I guess it sort of does 😅 I forgot about that PR. The intention was to update the doc since the tutorial guide no longer builds its own vllm container and just relies on the vllm backend.

matthewkotila · 2023-11-30T00:42:52Z

@nv-hwoo: @matthewkotila I guess it sort of does 😅 I forgot about that PR. The intention was to update the doc since the tutorial guide no longer builds its own vllm container and just relies on the vllm backend.

That's totally ok if it does, I just wanted to know if the linked PR can be closed, that's all 🙏

debermudez · 2023-11-30T18:16:13Z

src/c++/perf_analyzer/docs/examples/profile.py

@@ -420,6 +420,13 @@ def profile(args, export_file):
        f"--input-data={INPUT_FILENAME} "
        f"--profile-export-file={export_file} "
    )
+    if args.model == "ensemble":  # TRT-LLM


Are these TRT-LLM specific options or options specific to any ensemble?

I think it's a way of detecting whether the model is trtllm.

Absolutely, but will any other model use ensemble as its top-level model?
It feels like we are using a reserved word as a variable.

Maybe others will? It's not an ideal way of detecting, hence my suggestion in here:

#442 (comment)

Added a --backend argument that takes the backend type as a command-line option.

What is the behavior when the wrong backend is used?

Good point. I think we should throw an error when the user specifies an unsupported backend type.

@debermudez Added the check.
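For readers following this thread, here is a minimal sketch of what a `--backend` option with an unsupported-backend check could look like. It is not the code merged in this PR: the helper names `build_parser` and `validate_backend`, the default values, and the exact error message are assumptions made for illustration. The supported names come from the "(vllm/trtllm)" suggestion in the review, and the plain membership test follows the spirit of the "Change to direct string comparison" commit.

```python
import argparse

# Assumed set of backend names, based on the "(vllm/trtllm)" suggestion in the review.
SUPPORTED_BACKENDS = ("vllm", "trtllm")


def build_parser():
    """Build a parser that keeps the model name and the backend type separate."""
    parser = argparse.ArgumentParser(description="Sketch: backend selection")
    # -m names the model to profile; it says nothing about which backend serves it.
    parser.add_argument(
        "-m", "--model", type=str, default="vllm",  # hypothetical default
        help="Name of the model to profile",
    )
    # --backend states the backend explicitly instead of inferring it
    # from the model being called "ensemble".
    parser.add_argument(
        "--backend", type=str, default="vllm",  # hypothetical default
        help="Backend type of the model (vllm or trtllm)",
    )
    return parser


def validate_backend(backend):
    """Return the backend name unchanged, or raise if it is not supported."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported backend '{backend}'; "
            f"expected one of: {', '.join(SUPPORTED_BACKENDS)}"
        )
    return backend
```

With this shape, a call such as `build_parser().parse_args(["-m", "ensemble", "--backend", "trtllm"])` carries both pieces of information, and any later branching can test `args.backend` rather than `args.model == "ensemble"`.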

debermudez · 2023-11-30T18:21:23Z

src/c++/perf_analyzer/docs/examples/profile.py

    return input_data


 def main(args):
-    input_data = construct_input_data(args)
+    if args.model == "ensemble":


Do we want this to be trtllm?
Do we have other backends that we plan to support that will also use ensemble?

It's a good point. This is not an ideal way to detect that the model is trtllm. It might be better to let the user of profile.py specify which backend they're using (vllm/trtllm).

I assumed that was the point of the -m option.
Is that not the case?

-m provides the model name, which is different from the backend the model uses.

I appreciate the clarification.
The terms are a bit overloaded, so it's great to get clarification.

I agree with your suggestion.

nv-hwoo added 3 commits November 29, 2023 15:34

Support TRTLLM model and use vLLM backend

8ad47dd

Align spaces

2b1e17b

Move comment

514e9e1

nv-hwoo requested review from debermudez, matthewkotila and tgerdesnv November 29, 2023 23:49

nv-hwoo added 2 commits November 29, 2023 16:58

Specify shape of input tensors

0067a9d

Fix pre-commit hooks

6869902

matthewkotila mentioned this pull request Nov 30, 2023

Update LLM guide and profile.py to use new Triton+vLLM input names #412

Closed

debermudez requested changes Nov 30, 2023

nv-hwoo added 2 commits November 30, 2023 21:25

Fix metric error when there is only single response

07b1d74

Specify backend type for distinguishing input data

a5b5ae4

nv-hwoo requested a review from debermudez December 1, 2023 17:25

nv-hwoo added 2 commits December 1, 2023 09:30

Raise error when unknown backend specified.

7205793

Change to direct string comparison

dc5114a

debermudez approved these changes Dec 2, 2023

nv-hwoo merged commit c8c5f14 into main Dec 4, 2023
3 checks passed

nv-hwoo deleted the hwoo-support-trtllm branch December 4, 2023 17:14
