feat: Metrics Support in tritonfrontend
#7703
base: main
Conversation
````diff
@@ -57,14 +57,18 @@ Note: `model_path` may need to be edited depending on your setup.

 2. Now, to start up the respective services with `tritonfrontend`
 ```python
-from tritonfrontend import KServeHttp, KServeGrpc
+from tritonfrontend import KServeHttp, KServeGrpc, Metrics
````
I don't love that the `Metrics` object is a web server, so it makes me wonder if we should rename these down the line, e.g. `KServeHttpService`, `MetricsService`, etc.

But I don't have a strong opinion on an alternative right now, so I think it's fine; just mentioning it for later. We will probably be restructuring some packaging and naming in the near-to-mid future.
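For context, a hedged sketch of how these service objects are used together, mirroring the `KServeHttp`/`KServeGrpc` pattern the PR description refers to. The exact constructor signatures and option names here are assumptions, not verified against the PR:

```python
def serve_with_metrics(model_repository: str) -> None:
    # Requires a Triton build with tritonfrontend; imports are deferred
    # so this sketch stays importable without one installed.
    import tritonserver
    from tritonfrontend import KServeHttp, Metrics

    server = tritonserver.Server(model_repository=model_repository).start()
    http_options = KServeHttp.Options(port=8000)
    metrics_options = Metrics.Options(port=8002, thread_count=1)
    # Each frontend wraps `server` in a small web server; used as context
    # managers they are stopped cleanly on exit (see __exit__ below).
    with KServeHttp(server, http_options), Metrics(server, metrics_options):
        input("Serving... press Enter to stop\n")
```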
```
{
count: 1
kind : KIND_CPU
```
Did you ever investigate the GPU label thing?
Investigated a bit, but did not find the root cause. Will create a ticket in my backlog with hopefully a more consistent reproducer.
```python
"tritonclient.http.InferenceServerClient",
"tritonclient.grpc.InferenceServerClient",
```
I think this type hint is correct, and the other places where you use `Union` are missing `InferenceServerClient`. You could probably also use `InferenceServerClientBase`, though it'd be a bit less strict.
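To illustrate the point about `Union` completeness with a self-contained example; `HttpClient` and `GrpcClient` are hypothetical stand-ins for the two `InferenceServerClient` classes, not real tritonclient names:

```python
from typing import Union

class HttpClient:  # stand-in for tritonclient.http.InferenceServerClient
    pass

class GrpcClient:  # stand-in for tritonclient.grpc.InferenceServerClient
    pass

# The annotation should name the concrete client classes, e.g.
#   Union["tritonclient.http.InferenceServerClient",
#         "tritonclient.grpc.InferenceServerClient"]
# rather than just the modules. A shared base class (cf.
# InferenceServerClientBase) would also type-check, but less strictly.
def send_and_test_inference_identity(
    frontend_client: Union[HttpClient, GrpcClient], url: str
) -> bool:
    return isinstance(frontend_client, (HttpClient, GrpcClient))
```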
qa/L0_python_api/testing_utils.py
```diff
 # Sends an inference to test_model_repository/identity model and verifies input == output.
-def send_and_test_inference_identity(frontend_client, url: str) -> bool:
+def send_and_test_inference_identity(
+    frontend_client: Union["tritonclient.http", "tritonclient.grpc"], url: str
```
See other comment on type hints, apply throughout
```python
try:
    from ._metrics import Metrics
except ImportError:
    # TRITON_ENABLE_Metrics=OFF
```
Make sure `L0_build_variants` passes.
```diff
 def __exit__(self, exc_type, exc_value, traceback):
     self.triton_frontend.stop()
     if exc_type:
-        raise ERROR_MAPPING[exc_type](exc_value) from None
+        raise exc_type(exc_value) from None
```
Same comment: #7720 (comment). Why did this one keep the `from None` but the others didn't?
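The behavior the comment is asking about can be seen in a small stand-alone example; `reraise` here is a hypothetical stand-in, not the PR's code. `raise ... from None` suppresses the implicit "During handling of the above exception, another exception occurred" chaining when re-raising inside an `except` block or `__exit__`:

```python
def reraise(exc: BaseException) -> None:
    # Re-raise a fresh exception of the same type without implicit chaining.
    raise type(exc)(*exc.args) from None

try:
    try:
        raise ValueError("bad input")
    except ValueError as original:
        reraise(original)
except ValueError as wrapped:
    # `from None` sets __suppress_context__, so tracebacks omit the chain.
    suppressed = wrapped.__suppress_context__
    message = wrapped.args[0]
```

Without `from None`, the original exception would still be attached as `__context__` and printed as part of the traceback.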
```python
class Options:
    address: str = "0.0.0.0"
    port: int = Field(8002, ge=0, le=65535)
    thread_count: int = Field(1, ge=0)
```
Can thread count be zero?
Only minor comments - nice work!
```python
class Options:
    address: str = "0.0.0.0"
    port: int = Field(8002, ge=0, le=65535)
    thread_count: int = Field(1, ge=1)
```
Please fix lower bound in the http/kserve classes too
Should have added a test case for thread count 0, and probably for the other numerical options.
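A stdlib-only sketch of the test case suggested here, using a hypothetical `Options` stand-in that mimics the lower bounds pydantic's `Field(8002, ge=0, le=65535)` and `Field(1, ge=1)` enforce:

```python
from dataclasses import dataclass

@dataclass
class Options:  # stand-in mimicking the pydantic Options above
    address: str = "0.0.0.0"
    port: int = 8002
    thread_count: int = 1

    def __post_init__(self) -> None:
        # mirrors Field(8002, ge=0, le=65535) and Field(1, ge=1)
        if not 0 <= self.port <= 65535:
            raise ValueError("port must be in [0, 65535]")
        if self.thread_count < 1:
            raise ValueError("thread_count must be >= 1")

def rejects(**kwargs) -> bool:
    # True if the given option values fail validation.
    try:
        Options(**kwargs)
        return False
    except ValueError:
        return True
```

With `ge=1`, `Options(thread_count=0)` is rejected while the defaults pass, which is exactly what a boundary test should assert.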
What does the PR do?

Adds support for `Metrics` in `tritonfrontend`. This involves two components: `HTTPMetricsServer` and the `Metrics` class. With this PR, the metrics service can be used similarly to `KServeHttp` and `KServeGrpc`.

Additional changes made in this PR: `request.post(...)` (based on this comment).

Test plan:
Added 3 test functions to L0_python_api:
- `test_metrics_default_port()`: tests whether the metrics service can start as expected.
- `test_metrics_custom_port()`: tests whether arguments defined in `tritonfrontend.Metrics.Options` are passed successfully to `HTTPMetrics`.
- `test_metrics_update()`: tests whether the `nv_inference_count` value goes from 0 to 1 when an inference request is performed.
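A sketch of the kind of check `test_metrics_update()` implies: fetch the Prometheus exposition text from the metrics endpoint and compare the counter before and after inference. The parsing helper and sample lines below are illustrative, not the PR's actual test code:

```python
def metric_value(exposition: str, name: str) -> float:
    """Return the value of the first sample whose name matches `name`."""
    for line in exposition.splitlines():
        if line.startswith(name):
            # Prometheus text format: `name{labels} value`
            return float(line.rsplit(" ", 1)[1])
    raise KeyError(name)

# Illustrative exposition lines, as GET /metrics might return them
before = 'nv_inference_count{model="identity",version="1"} 0'
after = 'nv_inference_count{model="identity",version="1"} 1'
```

In the real test, `before` and `after` would come from two HTTP GETs against the `Metrics` service's port, with one inference request in between.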