With adaptive batching, if I set MLSERVER_MODEL_MAX_BATCH_TIME=1, expecting 1-second batching, the timeout is reached much sooner than it should be. Here is an example (with a debug print added just before each request is added to the batch, printing the calculated timeout). The debug prints show that the request timestamps do not match the remaining timeout: the first request arrives at 07:27:26,260, but MLServer concludes it has reached the 1-second timeout at 07:27:26,386, i.e. after approx. 130 ms.
remaining timeout 1.0 currently accumulated 0
remaining timeout 0.9999668598175049 currently accumulated 1
remaining timeout 0.9999086856842041 currently accumulated 2
2024-10-25 07:27:26,260 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,261 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9958555698394775 currently accumulated 3
remaining timeout 0.9917783737182617 currently accumulated 4
2024-10-25 07:27:26,263 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,264 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,266 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9834423065185547 currently accumulated 5
remaining timeout 0.9750831127166748 currently accumulated 6
remaining timeout 0.9666938781738281 currently accumulated 7
2024-10-25 07:27:26,267 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,269 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,271 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.95306396484375 currently accumulated 8
remaining timeout 0.939399003982544 currently accumulated 9
remaining timeout 0.9257209300994873 currently accumulated 10
2024-10-25 07:27:26,272 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,274 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,276 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.9061038494110107 currently accumulated 11
remaining timeout 0.8864269256591797 currently accumulated 12
remaining timeout 0.8667359352111816 currently accumulated 13
2024-10-25 07:27:26,278 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.843052864074707 currently accumulated 14
remaining timeout 0.8193438053131104 currently accumulated 15
2024-10-25 07:27:26,282 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.7896456718444824 currently accumulated 16
remaining timeout 0.7599077224731445 currently accumulated 17
2024-10-25 07:27:26,374 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,375 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.6421425342559814 currently accumulated 18
remaining timeout 0.524355411529541 currently accumulated 19
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,378 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.40399718284606934 currently accumulated 20
remaining timeout 0.2836179733276367 currently accumulated 21
2024-10-25 07:27:26,380 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,381 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout 0.1606147289276123 currently accumulated 22
remaining timeout 0.03759360313415527 currently accumulated 23
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
2024-10-25 07:27:26,383 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
remaining timeout -0.08817529678344727 currently accumulated 24
remaining timeout -0.21396827697753906 currently accumulated 25
2024-10-25 07:27:26,386 [mlserver.grpc] INFO - /inference.GRPCInferenceService/ModelInfer
The issue seems to be the timeout calculation in MLServer/mlserver/batching/adaptive.py (line 139 in 1ce29e6): each iteration subtracts the full time elapsed since the batch was opened from the already-decremented timeout, so the decrements compound. The log confirms this: around 07:27:26,383 the remaining timeout drops from 0.038 to -0.088 between two consecutive requests, a decrement of ~0.126 s that matches the ~125 ms elapsed since the first request at 07:27:26,260, not the few milliseconds between the two requests. I believe the formula should be timeout = self._max_batch_time - (current - start).
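For anyone who wants to see the compounding effect in isolation, here is a minimal standalone sketch. This is not MLServer's actual code; max_batch_time, the 5 ms arrival interval, and the loop structure are my own approximation of what adaptive.py does:

import time

max_batch_time = 1.0      # corresponds to MLSERVER_MODEL_MAX_BATCH_TIME=1
start = time.monotonic()  # moment the batch was opened
timeout = max_batch_time  # remaining wait budget for this batch

for accumulated in range(25):
    time.sleep(0.005)     # simulate requests arriving ~5 ms apart
    current = time.monotonic()

    # Buggy update: the *total* time elapsed since the batch was opened is
    # subtracted from the already-decremented timeout on every iteration,
    # so the decrements compound.
    timeout = timeout - (current - start)

    # Proposed fix: recompute the remaining budget from the batch start
    # time instead, so the timeout tracks wall-clock time exactly:
    # timeout = max_batch_time - (current - start)

    print(f"remaining timeout {timeout} currently accumulated {accumulated}")

With the buggy update, the printed timeout goes negative after roughly 100 ms of wall-clock time in this sketch (mirroring the ~130 ms seen in the log above), whereas the fixed line would only let it reach zero after the full second.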
hanlaur changed the title from "Adaptive batching batch timeout does not work as expected." to "Adaptive batching batch timeout fires sooner than configured" on Oct 25, 2024.
Feel free to provide a fix and we will be happy to review it soon.
Thanks @sakoush for the quick response! I was hoping to avoid going through the CLA steps this time, since this is just a one-variable-name change, so I created an issue instead of a PR. Would you be able to make the PR for this?