[Feature] Implement Health Check Endpoint for Delayed Service Startup #764
Comments
The v1/health_check endpoint might work for you; it should be implemented as a core feature. Could you check whether v1/health_check works for your case? |
@isaacncz |
…pea-project#764 Signed-off-by: Foong, Khang Sheong <[email protected]>
@isaacncz |
@louie-tsai I have tested the health check and it worked. However, for the LLM microservice, I cannot tell whether the model has finished downloading. |
@isaacncz |
It depends on the serving framework; we only know whether the service is ready or not.
@kevinintel |
OS type
Ubuntu
Description
When running the Translation example with Docker Compose, one of the containers takes additional time at startup to pull a model from Hugging Face. During this period, the service cannot handle HTTP requests, and callers receive HTTP 500 errors.
To improve reliability, I would like to propose adding a health check endpoint that reports when the service is ready to handle requests. Other services and users could then wait until the service is up before sending requests, avoiding unnecessary errors and improving the user experience.
Expected Behavior:
Docker Compose starts all services.
The service in question takes some time to pull the model.
A health check endpoint will be available to verify when the model has finished loading and the service is ready.
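From the caller's side, the expected behavior above could be sketched as a small polling loop that waits for a 200 before sending real requests. This is only a minimal sketch using the Python standard library; the function names, the timeout, and the polling interval are assumptions for illustration and are not part of the project.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for `url`, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 500 or 503 while the service is still starting
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, or timeout


def wait_until_ready(url, timeout_s=600.0, interval_s=5.0, probe=http_status):
    """Poll `url` until it answers 200; return False once `timeout_s` elapses.

    `probe` is injectable so the loop can be exercised without a live service.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller might run something like `wait_until_ready("http://localhost:9000/v1/health_check")` before sending the first translation request; the host and port here are placeholders, not values taken from the example.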
Proposed Solution:
Add a /health endpoint that returns:
200 OK when the service is fully operational.
503 Service Unavailable or similar status when the service is still initializing or loading the model.
Optionally, provide a message or a status code that indicates the estimated time remaining for startup.
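To make the proposal concrete, here is a minimal sketch of such a /health endpoint using only the Python standard library. It is not how the microservices are actually implemented: the `model_ready` flag, the `load_model` placeholder, the JSON body, and the port are all assumptions for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical readiness flag, set once the model has finished loading.
model_ready = threading.Event()


def health_response(ready):
    """Map readiness to the (status code, body) pair proposed above."""
    if ready:
        return 200, {"status": "ok"}
    return 503, {"status": "initializing"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = health_response(model_ready.is_set())
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # silence per-request logging in this sketch


def load_model():
    """Placeholder for the real (slow) model download and load step."""
    model_ready.set()


def serve(port=8080):
    """Load the model in the background while /health answers immediately."""
    threading.Thread(target=load_model, daemon=True).start()
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the server; until `load_model` finishes, GET /health returns 503, and afterwards it returns 200. Reporting an estimated time remaining would need progress information from the serving framework, which this sketch does not attempt.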