Adaptive batching batch timeout fires sooner than configured #1936

Open
hanlaur opened this issue Oct 25, 2024 · 3 comments

Comments

hanlaur commented Oct 25, 2024

Hello,

With adaptive batching, if I define MLSERVER_MODEL_MAX_BATCH_TIME=1, expecting a 1-second batching window, the timeout is reached much sooner than it should be. Here is an example (with a debug print added before each request is appended to the batch, showing the calculated remaining timeout). The debug prints show that the request timestamps and the remaining timeout do not match: the first request arrives at 07:27:26,260, but MLServer believes it has reached the 1-second timeout at 07:27:26,386, i.e. after approximately 130 ms.

remaining timeout 1.0 currently accumulated 0
remaining timeout 0.9999668598175049 currently accumulated 1
remaining timeout 0.9999086856842041 currently accumulated 2
2024-10-25 07:27:26,260 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,261 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9958555698394775 currently accumulated 3
remaining timeout 0.9917783737182617 currently accumulated 4
2024-10-25 07:27:26,263 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,264 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,266 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9834423065185547 currently accumulated 5
remaining timeout 0.9750831127166748 currently accumulated 6
remaining timeout 0.9666938781738281 currently accumulated 7
2024-10-25 07:27:26,267 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,269 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,271 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.95306396484375 currently accumulated 8
remaining timeout 0.939399003982544 currently accumulated 9
remaining timeout 0.9257209300994873 currently accumulated 10
2024-10-25 07:27:26,272 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,274 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,276 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9061038494110107 currently accumulated 11
remaining timeout 0.8864269256591797 currently accumulated 12
remaining timeout 0.8667359352111816 currently accumulated 13
2024-10-25 07:27:26,278 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.843052864074707 currently accumulated 14
remaining timeout 0.8193438053131104 currently accumulated 15
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.7896456718444824 currently accumulated 16
remaining timeout 0.7599077224731445 currently accumulated 17
2024-10-25 07:27:26,374 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.6421425342559814 currently accumulated 18
remaining timeout 0.524355411529541 currently accumulated 19
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.40399718284606934 currently accumulated 20
remaining timeout 0.2836179733276367 currently accumulated 21
2024-10-25 07:27:26,380 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,381 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.1606147289276123 currently accumulated 22
remaining timeout 0.03759360313415527 currently accumulated 23
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout -0.08817529678344727 currently accumulated 24
remaining timeout -0.21396827697753906 currently accumulated 25
2024-10-25 07:27:26,386 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer

The issue seems to be the timeout calculation here:

timeout = timeout - (current - start)

I believe the formula should be timeout = self._max_batch_time - (current - start).
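To illustrate why the in-loop update drains the budget too fast, here is a minimal standalone sketch (not MLServer's actual code; the function names, the 13 requests, and the 10 ms spacing are illustrative assumptions) comparing the two formulas:

```python
# Sketch of the compounding-subtraction bug. Each iteration of the buggy
# update subtracts the *total* elapsed time from an already-reduced budget,
# so deductions compound instead of tracking wall-clock time.

MAX_BATCH_TIME = 1.0  # seconds, as in MLSERVER_MODEL_MAX_BATCH_TIME=1

def remaining_buggy(arrival_times, start=0.0):
    # timeout = timeout - (current - start): elapsed time is deducted
    # repeatedly from the shrinking budget.
    timeout = MAX_BATCH_TIME
    for current in arrival_times:
        timeout = timeout - (current - start)
    return timeout

def remaining_fixed(arrival_times, start=0.0):
    # timeout = MAX_BATCH_TIME - (current - start): only the real elapsed
    # time since the batch window opened is deducted.
    timeout = MAX_BATCH_TIME
    for current in arrival_times:
        timeout = MAX_BATCH_TIME - (current - start)
    return timeout

arrivals = [0.01 * i for i in range(1, 14)]  # 13 requests, 10 ms apart
print(remaining_buggy(arrivals))  # ~0.09 s left after only 130 ms of real time
print(remaining_fixed(arrivals))  # ~0.87 s left, matching wall-clock elapsed time
```

With the buggy formula, 13 closely spaced requests consume roughly 0.91 s of the 1 s budget even though only 130 ms of real time has passed, which matches the compressed countdown visible in the log above.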

@hanlaur hanlaur changed the title Adaptive batching batch timeout does not work as expected. Adaptive batching batch timeout fires sooner than configured Oct 25, 2024
sakoush (Member) commented Oct 25, 2024

@hanlaur many thanks for raising the issue and identifying the underlying problem. This analysis makes sense.

Feel free to provide a fix and we will be happy to review it soon.

hanlaur (Author) commented Oct 25, 2024

"Feel free to provide a fix and we will be happy to review it soon."

Thanks @sakoush for the quick response! I was hoping to avoid going through the CLA steps this time, since this is just a one-variable-name change, so I created an issue instead of a PR. Would it be possible for you to make the PR for this?

sakoush (Member) commented Nov 12, 2024

@hanlaur hopefully we can look into this in the next couple of weeks; we will also need to add testing.
