Adaptive batching batch timeout fires sooner than configured #1936

Open
hanlaur opened this issue Oct 25, 2024 · 3 comments

Comments

hanlaur commented Oct 25, 2024

Hello,

With adaptive batching, if I define MLSERVER_MODEL_MAX_BATCH_TIME=1, expecting a 1-second batching window, the timeout is reached much sooner than it should be. Here is an example (with a debug print added before each request is appended to the batch, showing the calculated remaining timeout). The debug prints show that the request timestamps and the remaining timeout do not match: the first request arrives at 07:27:26,260, but MLServer believes it has reached the 1-second timeout at 07:27:26,386, i.e. after approximately 130 ms.

remaining timeout 1.0 currently accumulated 0
remaining timeout 0.9999668598175049 currently accumulated 1
remaining timeout 0.9999086856842041 currently accumulated 2
2024-10-25 07:27:26,260 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,261 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9958555698394775 currently accumulated 3
remaining timeout 0.9917783737182617 currently accumulated 4
2024-10-25 07:27:26,263 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,264 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,266 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9834423065185547 currently accumulated 5
remaining timeout 0.9750831127166748 currently accumulated 6
remaining timeout 0.9666938781738281 currently accumulated 7
2024-10-25 07:27:26,267 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,269 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,271 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.95306396484375 currently accumulated 8
remaining timeout 0.939399003982544 currently accumulated 9
remaining timeout 0.9257209300994873 currently accumulated 10
2024-10-25 07:27:26,272 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,274 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,276 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9061038494110107 currently accumulated 11
remaining timeout 0.8864269256591797 currently accumulated 12
remaining timeout 0.8667359352111816 currently accumulated 13
2024-10-25 07:27:26,278 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.843052864074707 currently accumulated 14
remaining timeout 0.8193438053131104 currently accumulated 15
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.7896456718444824 currently accumulated 16
remaining timeout 0.7599077224731445 currently accumulated 17
2024-10-25 07:27:26,374 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.6421425342559814 currently accumulated 18
remaining timeout 0.524355411529541 currently accumulated 19
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.40399718284606934 currently accumulated 20
remaining timeout 0.2836179733276367 currently accumulated 21
2024-10-25 07:27:26,380 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,381 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.1606147289276123 currently accumulated 22
remaining timeout 0.03759360313415527 currently accumulated 23
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout -0.08817529678344727 currently accumulated 24
remaining timeout -0.21396827697753906 currently accumulated 25
2024-10-25 07:27:26,386 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer

The issue seems to be the timeout calculation here:

timeout = timeout - (current - start)

I believe the formula should be timeout = self._max_batch_time - (current - start).
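To illustrate why the in-loop update drains the budget too fast, here is a minimal standalone sketch (not MLServer's actual code; the function names, the 13 requests, and the 10 ms spacing are illustrative assumptions) comparing the two formulas:

```python
# Sketch of the compounding-subtraction bug. Each iteration of the buggy
# update subtracts the *total* elapsed time from an already-reduced budget,
# so deductions compound instead of tracking wall-clock time.

MAX_BATCH_TIME = 1.0  # seconds, as in MLSERVER_MODEL_MAX_BATCH_TIME=1

def remaining_buggy(arrival_times, start=0.0):
    # timeout = timeout - (current - start): elapsed time is deducted
    # repeatedly from the shrinking budget.
    timeout = MAX_BATCH_TIME
    for current in arrival_times:
        timeout = timeout - (current - start)
    return timeout

def remaining_fixed(arrival_times, start=0.0):
    # timeout = MAX_BATCH_TIME - (current - start): only the real elapsed
    # time since the batch window opened is deducted.
    timeout = MAX_BATCH_TIME
    for current in arrival_times:
        timeout = MAX_BATCH_TIME - (current - start)
    return timeout

arrivals = [0.01 * i for i in range(1, 14)]  # 13 requests, 10 ms apart
print(remaining_buggy(arrivals))  # ~0.09 s left after only 130 ms of real time
print(remaining_fixed(arrivals))  # ~0.87 s left, matching wall-clock elapsed time
```

With the buggy formula, 13 closely spaced requests consume roughly 0.91 s of the 1 s budget even though only 130 ms of real time has passed, which matches the compressed countdown visible in the log above.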

@hanlaur hanlaur changed the title Adaptive batching batch timeout does not work as expected. Adaptive batching batch timeout fires sooner than configured Oct 25, 2024
sakoush (Member) commented Oct 25, 2024

@hanlaur many thanks for raising the issue and identifying the underlying problem. This analysis makes sense.

Feel free to provide a fix and we will be happy to review it soon.

hanlaur (Author) commented Oct 25, 2024

"Feel free to provide a fix and we will be happy to review it soon."

Thanks @sakoush for the quick response! I was hoping to avoid going through the CLA steps this time, since this is just a one-variable-name change, so I created an issue instead of a PR. Would it be possible for you to make the PR for this?

sakoush (Member) commented Nov 12, 2024

@hanlaur hopefully we can look into this in the next couple of weeks; we will also need to add testing.
